ABSTRACT

We present a new method that simultaneously solves for cosmology and galaxy
bias on non-linear scales. The method uses the halo model to analytically
describe the (non-linear) matter distribution, and the conditional luminosity
function (CLF) to specify the halo occupation statistics. For a given choice
of cosmological parameters, this model can be used to predict the galaxy
luminosity function, as well as the two-point correlation functions of
galaxies, and the galaxy-galaxy lensing signal, both as a function of scale
and luminosity. These observables have been reliably measured from the Sloan
Digital Sky Survey. In this paper, the first in a series, we present the
detailed, analytical model, which we test against mock galaxy redshift surveys
constructed from high-resolution numerical N-body simulations. We demonstrate
that our model, which includes scale-dependence of the halo bias and a proper
treatment of halo exclusion, reproduces the 3-dimensional galaxy-galaxy
correlation and the galaxy-matter cross-correlation (which can be projected to
predict the observables) with an accuracy better than 10 (in most cases 5)
percent. Ignoring either of these effects, as is often done, results in
systematic errors that easily exceed 40 percent on scales of $\sim 1\,h^{-1}$
Mpc, where the data are typically most accurate. Finally, since the projected
correlation functions of galaxies are never obtained by integrating the
redshift space correlation function along the line-of-sight out to infinity,
simply because the data only cover a finite volume, they are still affected by
residual redshift space distortions (RRSDs). Ignoring these, as done in
numerous studies in the past, results in systematic errors that easily exceed
20 percent on large scales ($r_{\rm p} > 10\,h^{-1}$ Mpc). We show that it is
fairly straightforward to correct for these RRSDs, to an accuracy better than
$\sim 2$ percent, using a mildly modified version of the linear Kaiser
formalism.

Key words: galaxies: halos — large-scale structure of Universe — dark matter — 
cosmological parameters — gravitational lensing — methods: statistical 
1 INTRODUCTION

The past decade has seen the emergence of precision cosmology. Various
experiments that probe fluctuations in the cosmic microwave background (CMB),
most notably the Wilkinson Microwave Anisotropy Probe (WMAP; Bennett et
al. 2003), have yielded constraints on various cosmological parameters at the
few percent level (Spergel et al. 2003, 2007; Dunkley et al. 2009; Komatsu et
al. 2009, 2011), and ongoing experiments, such as PLANCK, will tighten these
constraints even further. It is important, though, to complement these data
with non-CMB constraints, such as those provided by supernovae Ia, galaxy
clustering, galaxy peculiar velocities, cluster abundances, gravitational
lensing, the Lyman $\alpha$ forest and, in the not too distant future, 21cm
tomography of the neutral hydrogen at the era of reionization. These non-CMB
constraints are crucial for (i) breaking various parameter degeneracies
inherent in the CMB data*, (ii) constraining certain cosmological parameters
that are largely unconstrained by the CMB, such as evolution in the equation
of state of dark energy, and (iii) establishing a true concordance cosmology,
i.e., a cosmological model that is in agreement with all possible data sets.

With the advent of ever larger and more homogeneous 
galaxy redshift surveys, such as the Las Campanas Redshift 
Survey (LCRS; Shectman et al. 1996), the PSCz (Saunders 
et al. 2000), the two-Degree Field Galaxy Redshift Survey
(2dFGRS; Colless et al. 2003) and the Sloan Digital Sky 
Survey (SDSS; York et al. 2000), there has been a steady 
improvement in the tightness and reliability of the corre- 
sponding cosmological constraints. Most of these studies fo- 
cus on using galaxy clustering on large scales where one can 
rely on linear theory. Prime examples are constraints from 
(baryon acoustic oscillations in) the galaxy power spectrum 
(Percival et al. 2001; Cole et al. 2005; Eisenstein et al. 2005; 
Tegmark et al. 2006; Hütsi 2006; Percival et al. 2007a,b,c;
Padmanabhan et al. 2007; Gaztañaga, Cabré & Hui 2009;
Percival et al. 2010; Blake et al. 2011; Anderson et al. 2012). 

However, recently it has also become feasible to accu- 
rately model galaxy clustering on small, non-linear scales 
using the halo model approach combined with halo occu- 
pation statistics. The halo model postulates that all dark 
matter is partitioned over dark matter haloes, and describes 
the dark matter density distribution in terms of the halo 
building blocks (e.g., Neyman & Scott 1952; Seljak 2000; 
Ma & Fry 2000; Scoccimarro et al. 2001; Cooray & Sheth
2002). When combined with a model that describes how 
galaxies with certain properties are distributed over dark 
matter haloes of different mass, this can be used to make 
predictions for the clustering properties of galaxies on all 
scales that are observationally accessible (e.g., Jing, Mo & 
Börner 1998; Berlind & Weinberg 2002; Cooray & Sheth
2002; Yang, Mo & van den Bosch 2003). 

This approach has been used extensively in recent years 
to constrain the galaxy-dark matter connection, i.e., the con- 
nection between galaxy properties and halo mass, which 
holds important information regarding galaxy formation. 
On large, linear scales, the two-point correlation function between haloes of
mass $M$ can be written as $\xi_{\rm hh}(r|M) = b_{\rm h}^2(M)\,
\xi_{\rm mm}^{\rm lin}(r)$, with $\xi_{\rm mm}^{\rm lin}(r)$ the two-point
correlation function of the linear matter distribution and $b_{\rm h}(M)$ the
linear halo bias (e.g., Mo & White 1996). Similarly, for galaxies of a given
luminosity, one has that $\xi_{\rm gg}(r|L) = b_{\rm g}^2(L)\,
\xi_{\rm mm}^{\rm lin}(r)$, with $b_{\rm g}(L)$ the bias of galaxies of
luminosity $L$. Hence, one can use $\xi_{\rm gg}(r|L)$ to infer the average
mass of haloes that host galaxies of luminosity $L$ by simply finding the $M$
for which $b_{\rm h}(M) = [\xi_{\rm gg}(r|L)/\xi_{\rm mm}^{\rm lin}(r)]^{1/2}$.
By comparing the observed abundance of galaxies of luminosity $L$ to the
predicted abundance of haloes of mass $M$, one subsequently infers the average
number of galaxies per halo. Hence, measurements of $\xi_{\rm gg}(r|L)$ can be
used to constrain halo occupation statistics, and this technique has been
widely used
(Jing et al. 1998, 2002; Peacock & Smith 2000; Bullock, 
Wechsler & Somerville 2002; Magliocchetti & Porciani 2003; 

* For instance, the CMB as measured by WMAP is consistent
with a closed Universe with Hubble parameter $h = 0.3$ and no
cosmological constant (e.g., Spergel et al. 2007).



Yang et al. 2003, 2004; van den Bosch et al. 2003a, 2007; 
Porciani, Magliocchetti & Norberg 2004; Wang et al. 2004; 
Zehavi et al. 2004, 2005; Zheng 2004; Abazajian et al. 2005; 
Collister & Lahav 2005; Tinker et al. 2005, 2006; Lee et 
al. 2006). Note, though, that this method requires knowledge of both
$b_{\rm h}(M)$ and $\xi_{\rm mm}^{\rm lin}(r)$, both of which are strongly
cosmology dependent. Consequently, the resulting halo occupation statistics
are also cosmology dependent (see e.g.,
Zheng et al. 2002; Berlind & Weinberg 2002; van den Bosch 
et al. 2007; Cacciato et al. 2009). Although this makes it 
difficult to calibrate galaxy formation models using halo oc- 
cupation statistics (e.g., Berlind et al. 2003), it also implies 
that one can use this method to constrain cosmological pa- 
rameters as long as one has some independent constraints 
on halo occupation statistics. 

Various approaches to constrain cosmological parame- 
ters along these lines have been used in recent years. Abaza- 
jian et al. (2005) have shown that the degeneracy between 
occupation statistics and cosmology can (at least partially) 
be broken by using the correlation function itself, as long as 
one includes data on sufficiently small scales (i.e., the one- 
halo term). Using the projected correlation functions mea- 
sured from the SDSS and allowing the cosmological param- 
eters to vary within constraints imposed by various CMB 
experiments, they were able to obtain constraints that were 
significantly tighter than those from the CMB alone, with 
$\Omega_{\rm m} = 0.26 \pm 0.03$ and $\sigma_8 = 0.83 \pm 0.04$.

Zheng et al. (2002) suggested that one can break the 
degeneracy between halo occupation model and cosmology 
by using the peculiar velocities of galaxies as inferred from 
the redshift space distortions in the two-point correlation 
function. This idea was used by Yang et al. (2004), who concluded that the
power-spectrum normalization, $\sigma_8$, needs to be of the order of
$\sim 0.75$ (assuming $\Omega_{\rm m} = 0.3$), significantly lower than the
value then advocated by WMAP. Very similar results were obtained by van den
Bosch et al. (2007) and
by Tinker et al. (2007). The latter used a much more sophis- 
ticated treatment of redshift space distortions developed by 
Tinker, Weinberg & Zheng (2006) and Tinker (2007). 

An alternative approach for breaking the degeneracy 
between halo occupation model and cosmology is to use con- 
straints on the (average) mass-to-light ratios of dark mat- 
ter haloes. This method was first used by van den Bosch et 
al. (2003b) and Tinker et al. (2005), who were able to obtain 
relatively tight constraints on $\Omega_{\rm m}$ and $\sigma_8$ from combinations
of clustering data plus constraints on the mass-to-light ratios 
of clusters. Interestingly, both studies again found evidence 
for a relatively low value of the power spectrum normalization:
$\sigma_8 \sim 0.75$ for $\Omega_{\rm m} = 0.25$.

Along similar lines, one can also use a combination of 
clustering and galaxy-galaxy lensing. The latter effectively 
probes the galaxy-dark matter cross correlation, and there- 
fore holds information regarding the mass-to-light ratios of 
dark matter haloes covering a wide range in halo mass. Since 
its first detection by Brainerd, Blandford & Smail (1996), 
the accuracy of galaxy-galaxy lensing measurements has in- 
creased to the extent of yielding high signal-to-noise ratio 
measurements over a significant dynamic range in galaxy lu- 
minosity and/or stellar mass (e.g., Fisher et al. 2000; Hoek- 
stra et al. 2002; Sheldon et al. 2004, 2009; Mandelbaum 
et al. 2006, 2009; Leauthaud et al. 2007). Similar to the 
galaxy-galaxy autocorrelation function, the galaxy-matter
cross correlation function can be accurately modeled using
the halo model (Guzik & Seljak 2001, 2002; Mandelbaum 
et al. 2005; Yoo et al. 2006; Cacciato et al. 2009; Leau- 
thaud et al. 2011, 2012; van Uitert et al. 2011). Hence, the 
combination of galaxy clustering and galaxy-galaxy lensing 
is ideally suited to constrain cosmological parameters, as 
demonstrated in detail by Yoo et al. (2006). A first application of this idea
by Seljak et al. (2005), using the model of Guzik & Seljak (2002) and the
galaxy-galaxy lensing data of Mandelbaum et al. (2006), combined with WMAP
constraints, yielded $\sigma_8 = 0.88 \pm 0.06$, only marginally consistent
with the values obtained from the cluster mass-to-light ratios and/or the
redshift space distortions mentioned above. However, more recently, two
different analyses based on the same galaxy-galaxy lensing data by Cacciato et
al. (2009) and Li et al. (2009) both argued that a flat $\Lambda$CDM cosmology
with $(\Omega_{\rm m}, \sigma_8) = (0.238, 0.734)$ is in much better agreement
with the data than a $(0.3, 0.9)$ model. Although the
reason for the disagreement between these studies and that 
of Seljak et al. (2005) is probably related to the different 
modelling approaches, these studies all have demonstrated 
that a combination of clustering and lensing data holds great 
potential for constraining cosmological parameters. 

This is the first paper in a series in which we use a 
combination of galaxy clustering and galaxy-galaxy lensing 
data to constrain cosmological parameters. In this paper we 
present the theoretical framework and test the accuracy of 
our method using mock data. In More et al. 2012a (here- 
after Paper II) we present a Fisher matrix analysis to iden- 
tify parameter-degeneracies and to assess the accuracy with 
which various cosmological parameters can be constrained 
using the methodology presented here. Finally, in Cacciato 
et al. 2012b (hereafter Paper III) we apply our analysis to 
the actual SDSS data to constrain cosmological parameters 
(in particular $\Omega_{\rm m}$ and $\sigma_8$) under the assumption of a
'standard' flat $\Lambda$CDM cosmology.

Throughout this paper, unless specifically stated other- 
wise, all radii and densities will be in comoving units, and log 
is used to refer to the 10-based logarithm. Quantities that 
depend on the Hubble parameter will be written in units of 
$h = H_0/(100\,{\rm km\,s^{-1}\,Mpc^{-1}})$.



2 MODEL DESCRIPTION

Our main goal is to use galaxy clustering and galaxy-galaxy lensing, measured
as a function of luminosity from the main galaxy sample in the SDSS, to
simultaneously constrain cosmology and halo occupation statistics. As detailed
in Papers II and III, the data that we will use consist of (i) the galaxy
luminosity function, $\Phi(L,z)$, at the median redshift of the SDSS main
galaxy sample ($z \sim 0.1$), (ii) the projected two-point correlation
functions, $w_{\rm p}(r_{\rm p}|L_1,L_2,z)$, for galaxies in six luminosity
bins, $[L_1,L_2]$, each with its own median redshift $z$, and (iii) the
corresponding excess surface densities (ESD), $\Delta\Sigma(R|L_1,L_2,z)$.

The projected correlation function, $w_{\rm p}(r_{\rm p}|L_1,L_2,z)$, is
related to the corresponding galaxy-galaxy correlation function in real space,
$\xi_{\rm gg}(r|L_1,L_2,z)$, via a simple Abel integral

$$ w_{\rm p}(r_{\rm p}|L_1,L_2,z) = 2 \int_{r_{\rm p}}^{\infty} \xi_{\rm gg}(r|L_1,L_2,z)\, \frac{r\,{\rm d}r}{\sqrt{r^2 - r_{\rm p}^2}} , \tag{1} $$

(but see §2.3 below). The ESD, $\Delta\Sigma(R|L_1,L_2,z)$, is related to the
tangential shear, $\gamma_{\rm t}(R|L_1,L_2,z)$, measured around galaxies (the
lenses) at redshift $z$ with luminosities in the range $[L_1,L_2]$ according
to

$$ \Delta\Sigma(R|L_1,L_2,z) = \bar{\Sigma}(<R|L_1,L_2,z) - \Sigma(R|L_1,L_2,z) = \gamma_{\rm t}(R|L_1,L_2,z)\, \Sigma_{\rm crit} . \tag{2} $$

Here $\Sigma_{\rm crit}$ is a geometrical parameter that depends on the
comoving distances of the sources and lenses, and $\Sigma(R|L_1,L_2,z)$ is the
azimuthally-averaged projected surface mass density of the gravitational
lenses, which is related to the galaxy-matter cross correlation function,
$\xi_{\rm gm}(r|L_1,L_2,z)$, according to

$$ \Sigma(R|L_1,L_2,z) = 2 \bar{\rho}_{\rm m} \int_{R}^{\infty} \left[1 + \xi_{\rm gm}(r|L_1,L_2,z)\right] \frac{r\,{\rm d}r}{\sqrt{r^2 - R^2}} , \tag{3} $$

and $\bar{\Sigma}(<R|L_1,L_2,z)$ is its average inside $R$;

$$ \bar{\Sigma}(<R|L_1,L_2,z) = \frac{2}{R^2} \int_0^R \Sigma(R'|L_1,L_2,z)\, R'\, {\rm d}R' , \tag{4} $$

(Miralda-Escudé 1991; Sheldon et al. 2004; see also §2.2).

In this section, we present analytical expressions for
$w_{\rm p}(r_{\rm p}|L_1,L_2,z)$, $\Delta\Sigma(R|L_1,L_2,z)$ and $\Phi(L,z)$.
For completeness and clarity we present a detailed, step-by-step derivation of
our method, and we will emphasize where it differs from that of previous
authors. The backbone of our model is the halo model, in which the matter
distribution in the Universe is described in terms of its halo building blocks
(see Cooray & Sheth 2002 and Mo, van den Bosch & White 2010 for comprehensive
reviews). After a detailed description of how the halo model can be used to
compute the power spectrum of the dark matter mass distribution (§2.1), we
show how the halo model can be complemented with a model for halo occupation
statistics which allows one to compute $w_{\rm p}(r_{\rm p}|L_1,L_2,z)$,
$\Delta\Sigma(R|L_1,L_2,z)$ and $\Phi(L,z)$ for a given cosmology.

In order to keep the derivations concise, in what follows we will not
explicitly write down the dependencies on $L_1$ and $L_2$.
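As a rough numerical illustration of the Abel integral in Eq. (1), the sketch below projects a real-space correlation function using the substitution $r = r_{\rm p}\cosh t$, which removes the integrable singularity at $r = r_{\rm p}$. The function names, the integration settings, and the power-law test case (with correlation length 5 and slope 1.8) are assumptions made for this example only; they are not taken from the paper.

```python
import math

def projected_correlation(xi, r_p, t_max=25.0, n_steps=5000):
    """w_p(r_p) = 2 * int_{r_p}^inf xi(r) r dr / sqrt(r^2 - r_p^2), as in Eq. (1).

    With r = r_p * cosh(t) the measure r dr / sqrt(r^2 - r_p^2) becomes
    r_p * cosh(t) dt, so the integrand is smooth on [0, t_max].
    """
    dt = t_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        r = r_p * math.cosh(i * dt)
        weight = 0.5 if i in (0, n_steps) else 1.0  # trapezoid end points
        total += weight * xi(r) * r
    return 2.0 * total * dt

def power_law_xi(r, r0=5.0, gamma=1.8):
    """Hypothetical test case: xi(r) = (r / r0)^(-gamma)."""
    return (r / r0) ** (-gamma)

def power_law_wp(r_p, r0=5.0, gamma=1.8):
    """Closed-form projection of the power law, used as a cross-check."""
    g = math.gamma
    return r_p * (r0 / r_p) ** gamma * g(0.5) * g(0.5 * (gamma - 1.0)) / g(0.5 * gamma)
```

For a power law the numerical and closed-form projections should agree closely; for a tabulated halo-model $\xi_{\rm gg}(r)$ one would pass an interpolating function as `xi` and check convergence in `t_max` and `n_steps`.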



2.1 The halo model 

Throughout this paper we define dark matter haloes as 
spherical overdensity regions with a radius, $r_{200}$, inside of
which the average density is 200 times the average density 
of the Universe. Hence, the mass of a halo is 

$$ M = \frac{4\pi}{3}\, 200\, \bar{\rho}_{\rm m}\, r_{200}^3 . \tag{5} $$
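To make the mass definition of Eq. (5) concrete, the following sketch converts between halo mass and $r_{200}$ in comoving units. The value $\Omega_{\rm m} = 0.3$ is an assumed default for illustration, not a value adopted in the text; the critical density $2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$ is the standard present-day value.

```python
import math

RHO_CRIT = 2.775e11  # present-day critical density in h^2 Msun / Mpc^3

def r200_comoving(mass, omega_m=0.3):
    """Comoving r_200 (in Mpc/h) for a halo of the given mass (in Msun/h).

    Inverts Eq. (5): M = (4 pi / 3) * 200 * rho_mean * r_200^3, where
    rho_mean is the comoving mean matter density (omega_m is an assumption).
    """
    rho_mean = omega_m * RHO_CRIT
    return (3.0 * mass / (4.0 * math.pi * 200.0 * rho_mean)) ** (1.0 / 3.0)

def m200(radius, omega_m=0.3):
    """Halo mass (Msun/h) enclosed by a comoving radius r_200 (Mpc/h), Eq. (5)."""
    rho_mean = omega_m * RHO_CRIT
    return 4.0 * math.pi / 3.0 * 200.0 * rho_mean * radius ** 3
```

With these assumed inputs a $10^{12}\,h^{-1}M_\odot$ halo has $r_{200}$ of roughly a quarter of an $h^{-1}$ Mpc. Because the densities are comoving, the same relation holds at any redshift.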

Under the assumption that all dark matter is bound in virialized dark matter
haloes, the density perturbation field at redshift $z$, defined by

$$ \delta_{\rm m}({\bf x},z) = \frac{\rho_{\rm m}({\bf x},z)}{\bar{\rho}_{\rm m}} - 1 , \tag{6} $$

can be written in terms of the spatial distribution of dark matter haloes and
their internal density profiles. Throughout we assume that dark matter haloes
are spherically symmetric and have a density profile,
$\rho_{\rm h}(r|M,z) = M\, u_{\rm h}(r|M,z)$, that depends only on mass, $M$,
and redshift, $z$. Note that $\int u_{\rm h}({\bf x}|M,z)\, {\rm d}^3{\bf x} = 1$.

Now imagine that space (at some redshift $z$) is divided into many small
volumes, $\Delta V_i$, which are so small that none of them contains more than
one halo center. The occupation number of haloes in the $i^{\rm th}$ volume,
$\mathcal{N}_{{\rm h},i}$, is therefore either 0 or 1, and so
$\mathcal{N}_{{\rm h},i} = \mathcal{N}_{{\rm h},i}^2 = \mathcal{N}_{{\rm h},i}^3 = \ldots$.
In terms of these occupation numbers the density field of the (dark) matter
can formally be written as

$$ \rho_{\rm m}({\bf x},z) = \sum_i \mathcal{N}_{{\rm h},i}\, M_i\, u_{\rm h}({\bf x} - {\bf x}_i|M_i,z) , \tag{7} $$



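where $M_i$ is the mass of the halo whose center is in $\Delta V_i$. Using
that the ensemble average
$\langle \mathcal{N}_{{\rm h},i}\, M_i\, u_{\rm h}({\bf x} - {\bf x}_i|M_i,z) \rangle$
is equal to $\int {\rm d}M\, n(M,z)\, M\, \Delta V_i\, u_{\rm h}({\bf x} - {\bf x}_i|M,z)$,
where $n(M,z)$ is the halo mass function, we have that

$$ \langle \rho_{\rm m}({\bf x},z) \rangle = \int {\rm d}M\, M\, n(M,z) \sum_i \Delta V_i\, u_{\rm h}({\bf x} - {\bf x}_i|M,z) = \int {\rm d}M\, M\, n(M,z) \int {\rm d}^3{\bf x}'\, u_{\rm h}({\bf x} - {\bf x}'|M,z) = \bar{\rho}_{\rm m} , \tag{8} $$

where the last equality follows from the normalization of
$u_{\rm h}({\bf x}|M,z)$ and from the halo model ansatz that all dark
matter is partitioned over dark matter haloes.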

Similar to $\delta_{\rm m}$ we can also define the halo density contrast
$\delta_{\rm h}$. Ignoring possible stochasticity in the relation between
$\delta_{\rm m}$ and $\delta_{\rm h}$, we can use a Taylor series expansion to
write

$$ \delta_{\rm h}({\bf x};M,z) = \delta_{\rm h}(\delta_{\rm m}) = \sum_{n=0}^{\infty} \frac{b_{{\rm h},n}(M,z)}{n!}\, \delta_{\rm m}^n({\bf x},z) , \tag{9} $$

(Fry & Gaztañaga 1993; Mo, Jing & White 1997), where $b_{{\rm h},n}$ is called
the halo bias factor of order $n$. Although the requirement that
$\langle \delta_{\rm h} \rangle = 0$ implies that
$b_{{\rm h},0} = -\sum_{n=2}^{\infty} b_{{\rm h},n} \langle \delta_{\rm m}^n \rangle / n!$,
which in general is not zero, one can ignore $b_{{\rm h},0}$ since in Fourier
space it only contributes to the galaxy power spectrum for wavevector $k = 0$.
Furthermore, on large scales we have that $|\delta_{\rm m}| \ll 1$, so that we
can also neglect the higher-order ($n > 1$) bias factors. Hence, on large
scales the cross correlation function of haloes of mass $M_1$ and haloes of
mass $M_2$ can be written as

$$ \xi_{\rm hh}(r|M_1,M_2,z) \simeq b_{\rm h}(M_1,z)\, b_{\rm h}(M_2,z)\, \xi_{\rm mm}^{\rm lin}(r,z) , \tag{10} $$



where $\xi_{\rm mm}^{\rm lin}(r,z)$ is the two-point correlation function of
the initial density perturbation field, linearly extrapolated to redshift $z$,
and we have used $b_{\rm h}(M,z)$ as shorthand notation for the linear halo
bias $b_{{\rm h},1}(M,z)$. One can extend this prescription to the mildly
non-linear regime, in which one can no longer ignore the higher-order bias
terms, by replacing $\xi_{\rm mm}^{\rm lin}(r,z)$ with the non-linear
two-point matter correlation function, $\xi_{\rm mm}(r,z)$, and by including a
radial dependence of the halo bias, $\zeta(r,z)$ (which effectively captures
the effect of the higher-order bias parameters, see §3.4 below). Under the
assumption that haloes are spherical, one then obtains that

$$ 1 + \xi_{\rm hh}(r|M_1,M_2,z) = \left[1 + b_{\rm h}(M_1,z)\, b_{\rm h}(M_2,z)\, \zeta(r,z)\, \xi_{\rm mm}(r,z)\right] \Theta(r - r_{\rm min}) , \tag{11} $$



where $\Theta(x)$ is the Heaviside step function, which assures that
$\xi_{\rm hh}(r|M_1,M_2,z) = -1$ for $r < r_{\rm min}$ in order to account for
halo exclusion, i.e., the fact that dark matter haloes cannot overlap. In
principle, one expects that
$r_{\rm min} = r_{\rm min}(M_1,M_2,z) = r_{200}(M_1,z) + r_{200}(M_2,z)$.
However, the halo finder used by Tinker et al. (2008), whose halo mass
function we use, does allow overlap of haloes in that any halo is considered a
host halo as long as its center does not lie within the outer radius of
another halo. Therefore, to be consistent, we follow Tinker et al. (2012) and
Leauthaud et al. (2011), and adopt
$r_{\rm min} = {\rm MAX}[r_{200}(M_1,z), r_{200}(M_2,z)]$.
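The exclusion prescription of Eq. (11) can be sketched in a few lines. In the example below the matter correlation function, the radial bias function, the bias values and the radii are all placeholders supplied by the caller; the flag `overlap_allowed` is a hypothetical name that switches between the two choices of $r_{\rm min}$ discussed above.

```python
def one_plus_xi_hh(r, b1, b2, r200_1, r200_2, xi_mm,
                   zeta=lambda r: 1.0, overlap_allowed=True):
    """Evaluate 1 + xi_hh(r | M1, M2) following Eq. (11).

    b1, b2         : linear halo bias of the two haloes
    r200_1, r200_2 : their r_200 radii (same units as r)
    xi_mm          : callable, non-linear matter correlation function
    zeta           : callable, radial bias function (defaults to unity,
                     which is an assumption of this sketch)
    overlap_allowed: True  -> r_min = max(r200_1, r200_2), the choice adopted
                     False -> r_min = r200_1 + r200_2, strict exclusion
    """
    r_min = max(r200_1, r200_2) if overlap_allowed else r200_1 + r200_2
    if r < r_min:
        return 0.0  # Heaviside factor: xi_hh = -1 inside the exclusion radius
    return 1.0 + b1 * b2 * zeta(r) * xi_mm(r)
```

The sketch takes the step function to be unity at $r = r_{\rm min}$, a convention the text does not specify and which does not affect the integrals that follow.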

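For computational convenience, we will be working in Fourier space. To that
end we define the Fourier transform of $\rho_{\rm m}({\bf x},z)$ as

$$ \tilde{\rho}_{\rm m}({\bf k},z) = \frac{1}{V} \int \rho_{\rm m}({\bf x},z)\, e^{-i{\bf k}\cdot{\bf x}}\, {\rm d}^3{\bf x} = \frac{1}{V} \sum_i \mathcal{N}_{{\rm h},i}\, M_i\, \tilde{u}_{\rm h}({\bf k}|M_i,z)\, e^{-i{\bf k}\cdot{\bf x}_i} , \tag{12} $$

where $V$ is the volume over which the Universe is assumed to be periodic, and

$$ \tilde{u}_{\rm h}({\bf k}|M,z) = \int u_{\rm h}({\bf x}|M,z)\, e^{-i{\bf k}\cdot{\bf x}}\, {\rm d}^3{\bf x} , \tag{13} $$
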

is the Fourier transform of the normalized halo density pro- 
file. With our definition of the Fourier transform, the (non- 
linear) matter-matter power spectrum is defined as 



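$$ P_{\rm mm}({\bf k},z) \equiv V \langle |\tilde{\delta}_{\rm m}({\bf k},z)|^2 \rangle = \frac{V}{\bar{\rho}_{\rm m}^2} \langle \tilde{\rho}_{\rm m}({\bf k},z)\, \tilde{\rho}_{\rm m}^*({\bf k},z) \rangle - V \delta^{\rm D}({\bf k}) , \tag{14} $$

where

$$ \delta^{\rm D}({\bf k}) = \frac{1}{V} \int e^{-i{\bf k}\cdot{\bf x}}\, {\rm d}^3{\bf x} , \tag{15} $$

is the Dirac delta function, $\tilde{\rho}^*$ indicates the complex conjugate
of $\tilde{\rho}$, and we have used that $\tilde{\rho}_{\rm m}(0) = \bar{\rho}_{\rm m}$.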
Using Eq. (12) we have that 



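$$ \langle \tilde{\rho}_{\rm m}({\bf k},z)\, \tilde{\rho}_{\rm m}^*({\bf k},z) \rangle = \frac{1}{V^2} \sum_i \sum_j \langle \mathcal{N}_{{\rm h},i}\, M_i\, \mathcal{N}_{{\rm h},j}\, M_j\, \tilde{u}_{\rm h}({\bf k}|M_i,z)\, \tilde{u}_{\rm h}^*({\bf k}|M_j,z)\, e^{-i{\bf k}\cdot({\bf x}_i - {\bf x}_j)} \rangle , \tag{16} $$

which we split in two terms: the one-halo term, for which $j = i$, and the
two-halo term with $j \neq i$. The former can be written as

$$ \langle \tilde{\rho}_{\rm m}({\bf k},z)\, \tilde{\rho}_{\rm m}^*({\bf k},z) \rangle_{\rm 1h} = \frac{1}{V^2} \sum_i \langle \mathcal{N}_{{\rm h},i}\, M_i^2\, |\tilde{u}_{\rm h}({\bf k}|M_i,z)|^2 \rangle = \frac{1}{V} \int {\rm d}M\, M^2\, n(M,z)\, |\tilde{u}_{\rm h}({\bf k}|M,z)|^2 , \tag{17} $$

where we have used that $\mathcal{N}_{{\rm h},i}^2 = \mathcal{N}_{{\rm h},i}$.
For the 2-halo term we use the fact that we are free to choose $\Delta V_i$
arbitrarily small, so that

$$ \langle \tilde{\rho}_{\rm m}({\bf k},z)\, \tilde{\rho}_{\rm m}^*({\bf k},z) \rangle_{\rm 2h} = \frac{1}{V^2} \int {\rm d}^3{\bf y}_1 \int {\rm d}M_1\, M_1\, n(M_1,z)\, \tilde{u}_{\rm h}({\bf k}|M_1,z) \int {\rm d}^3{\bf y}_2 \int {\rm d}M_2\, M_2\, n(M_2,z)\, \tilde{u}_{\rm h}^*({\bf k}|M_2,z) \left[1 + \xi_{\rm hh}({\bf y}_1 - {\bf y}_2|M_1,M_2,z)\right] e^{-i{\bf k}\cdot({\bf y}_1 - {\bf y}_2)} . \tag{18} $$

Here we have accounted for the fact that dark matter haloes are clustered, as
described by the two-point halo-halo correlation function
$\xi_{\rm hh}(r|M_1,M_2,z)$.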

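Hence, using Eq. (11), which properly accounts for halo exclusion, we have
that

$$ \langle \tilde{\rho}_{\rm m}({\bf k},z)\, \tilde{\rho}_{\rm m}^*({\bf k},z) \rangle_{\rm 2h} = \frac{1}{V} \int {\rm d}M_1\, M_1\, n(M_1,z)\, \tilde{u}_{\rm h}({\bf k}|M_1,z) \int {\rm d}M_2\, M_2\, n(M_2,z)\, \tilde{u}_{\rm h}({\bf k}|M_2,z)\, Q(k|M_1,M_2,z) . \tag{19} $$

Here we have used that $\tilde{u}_{\rm h}^*({\bf k}|M,z) = \tilde{u}_{\rm h}({\bf k}|M,z)$,
which follows from the fact that $u_{\rm h}({\bf x}|M,z)$ is real and even,
and we have defined

$$ Q(k|M_1,M_2,z) \equiv 4\pi \int_0^{\infty} \left[1 + \xi_{\rm hh}(r|M_1,M_2,z)\right] \frac{\sin kr}{kr}\, r^2\, {\rm d}r , \tag{20} $$

with $k = |{\bf k}|$ and with $\xi_{\rm hh}(r|M_1,M_2,z)$ given by Eq. (11).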

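Combining Eqs. (14)-(19), and using that haloes are defined to be spherically
symmetric, we finally obtain that

$$ P_{\rm mm}(k,z) = P_{\rm mm}^{\rm 1h}(k,z) + P_{\rm mm}^{\rm 2h}(k,z) - V \delta^{\rm D}(k) , \tag{21} $$

where

$$ P_{\rm mm}^{\rm 1h}(k,z) = \frac{1}{\bar{\rho}_{\rm m}^2} \int {\rm d}M\, M^2\, n(M,z)\, |\tilde{u}_{\rm h}(k|M,z)|^2 , \tag{22} $$

and

$$ P_{\rm mm}^{\rm 2h}(k,z) = \frac{1}{\bar{\rho}_{\rm m}^2} \int {\rm d}M_1\, M_1\, n(M_1,z)\, \tilde{u}_{\rm h}(k|M_1,z) \int {\rm d}M_2\, M_2\, n(M_2,z)\, \tilde{u}_{\rm h}(k|M_2,z)\, Q(k|M_1,M_2,z) . \tag{23} $$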



Our treatment of halo exclusion is similar to that of Smith, 
Scoccimarro & Sheth (2007) and Smith, Desjacques & 
Marian (2011), except that we have included the (semi-empirical)
factor $\zeta(r,z)$ to account for the radial dependence
of halo bias. As shown in Smith et al. (2011), Eq. (23) has the 
correct asymptotic behavior at both large and small scales. 
This is an important improvement over a number of approx- 
imate methods that have been advocated and which typi- 
cally involve adopting an upper limit for the mass interval 
used in the integral for the 2-halo term of the power spec- 
trum (e.g., Takada & Jain 2003; Zheng 2004; Abazajian et 
al. 2005; Tinker et al. 2005, 2012; Yoo et al. 2006; Leauthaud 
et al. 2011). None of these methods, however, are mathemat- 
ically correct. Furthermore, accurate, numerical evaluation 
of Eq. (23) is not significantly more CPU demanding than 
using the approximate method, largely rescinding its main 
motivation. Finally, as shown in Smith et al. (2011), Eq. (23) 
has the additional advantage that it appears to resolve the 
well-known problem of excess large-scale power in the halo 
model. This problem arises due to the fact that the 1-halo 
term approaches a constant value on large scales in Fourier 
space, significantly in excess of the shot noise (see discus- 
sions in Cooray & Sheth 2002; Smith et al. 2003; Crocce & 
Scoccimarro 2008). A proper treatment of halo exclusion, as 
adopted here, (almost) nullifies this large scale power of the 
1-halo term. 



2.2 The galaxy-galaxy correlation function 

If one assumes that each galaxy resides in a dark mat- 
ter halo, the halo model described above can also be used 
to compute the galaxy-galaxy correlation function or the 
galaxy-matter cross correlation function. All that is needed 
is a statistical description of how galaxies are distributed 
over dark matter haloes of different mass. To that end we use the conditional
luminosity function (hereafter CLF) introduced by Yang et al. (2003). The CLF,
$\Phi(L|M)\,{\rm d}L$, specifies the average number of galaxies with
luminosities in the range $L \pm {\rm d}L/2$ that reside in a halo of mass $M$.

Throughout we ignore a potential redshift dependence 
of the CLF. Since the data that we use to constrain the CLF 
only covers a narrow range in redshift (see Paper III), this 
assumption will not have a strong impact on our results. 
Once the CLF is specified, the galaxy luminosity function at redshift $z$,
$\Phi(L,z)$, simply follows from integrating over the halo mass function,
$n(M,z)$;

$$ \Phi(L,z) = \int \Phi(L|M)\, n(M,z)\, {\rm d}M . \tag{24} $$

In what follows, we will always be concerned with galaxies in a specific
luminosity interval $[L_1,L_2]$. The average number density of such galaxies
follows from the CLF according to

$$ \bar{n}_{\rm g}(z) = \int \langle N_{\rm g}|M \rangle\, n(M,z)\, {\rm d}M , \tag{25} $$

where

$$ \langle N_{\rm g}|M \rangle = \int_{L_1}^{L_2} \Phi(L|M)\, {\rm d}L , \tag{26} $$

is the average number of galaxies with $L_1 < L < L_2$ that reside in a halo
of mass $M$.
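The occupation number of Eq. (26) is simply the CLF integrated over the luminosity bin. The sketch below does this with the trapezoid rule in $\ln L$ for a toy CLF that is log-normal in $L$; this functional form, its parameters (`lum_c`, `sigma`) and the function names are assumptions chosen for illustration and should not be read as the form adopted later in the paper.

```python
import math

def clf_lognormal(lum, lum_c, sigma):
    """Toy CLF Phi(L|M): log-normal in L with median lum_c and scatter sigma (dex).

    Normalized so that the integral over all L equals one galaxy per halo.
    """
    x = (math.log10(lum) - math.log10(lum_c)) / sigma
    norm = math.sqrt(2.0 * math.pi) * math.log(10.0) * sigma * lum
    return math.exp(-0.5 * x * x) / norm

def mean_occupation(lum_1, lum_2, lum_c, sigma, n_steps=2000):
    """<N|M> = int_{L1}^{L2} Phi(L|M) dL (Eq. 26), trapezoid rule in ln L."""
    a, b = math.log(lum_1), math.log(lum_2)
    h = (b - a) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        lum = math.exp(a + i * h)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * clf_lognormal(lum, lum_c, sigma) * lum  # dL = L dlnL
    return total * h
```

A bin that starts at the median luminosity and extends far above it contains half of the toy distribution. Integrating the result over a halo mass function, with `lum_c` depending on mass, would give the number density of Eq. (25).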

For reasons that will become clear below, we split the 
galaxy population in centrals (defined as those galaxies that 
reside at the center of their host halo) and satellites (those 
that orbit around a central), and we split the CLF in two 
terms accordingly: 



$$ \Phi(L|M) = \Phi_{\rm c}(L|M) + \Phi_{\rm s}(L|M) , \tag{27} $$

where $\Phi_{\rm c}(L|M)$ and $\Phi_{\rm s}(L|M)$ represent central and
satellite galaxies, respectively (cf., Cooray & Milosavljević 2005).
Similarly, we write the number density of galaxies, $n_{\rm g}({\bf x},z)$, as
the sum of the contribution of centrals, $n_{\rm c}({\bf x},z)$, and that of
satellites, $n_{\rm s}({\bf x},z)$, so that

$$ \delta_{\rm g}({\bf x},z) \equiv \frac{n_{\rm g}({\bf x},z) - \bar{n}_{\rm g}(z)}{\bar{n}_{\rm g}(z)} = f_{\rm c}(z)\, \delta_{\rm c}({\bf x},z) + f_{\rm s}(z)\, \delta_{\rm s}({\bf x},z) . \tag{28} $$

Here $f_{\rm c}(z) = \bar{n}_{\rm c}(z)/\bar{n}_{\rm g}(z)$ is the central
fraction, $f_{\rm s}(z) = \bar{n}_{\rm s}(z)/\bar{n}_{\rm g}(z) = 1 - f_{\rm c}(z)$
is the satellite fraction, and $\delta_{\rm c}({\bf x},z)$ and
$\delta_{\rm s}({\bf x},z)$ are the number density contrasts of centrals and
satellites at redshift $z$, respectively. Note that $\bar{n}_{\rm c}(z)$ and
$\bar{n}_{\rm s}(z)$ simply follow from Eq. (25) by replacing $\Phi(L|M)$ in
Eq. (26) by $\Phi_{\rm c}(L|M)$ and $\Phi_{\rm s}(L|M)$, respectively.

The detailed functional form that we adopt for $\Phi(L|M)$ is discussed in
§3.7. In this subsection we show how the CLF enters in the computation of the
(projected) galaxy-galaxy correlation function,
$w_{\rm p}(r_{\rm p}|L_1,L_2,z)$, and in the excess surface density profile,
$\Delta\Sigma(R|L_1,L_2,z)$.

Within the framework of the halo model, we can write

$$ n_{\rm c}({\bf x},z) = \sum_i \mathcal{N}_{{\rm h},i}\, \mathcal{N}_{{\rm c},i}\, \delta^{\rm D}({\bf x} - {\bf x}_i) , \tag{29} $$

where $\mathcal{N}_{{\rm c},i}$ is the number of central galaxies in the halo
whose center is in volume element $i$ (i.e., $\mathcal{N}_{{\rm c},i}$ is
either 0 or 1). The Dirac delta function expresses the fact that a central, by
definition, resides at the center of a dark matter halo. Similarly, for the
satellite galaxies we can write

$$ n_{\rm s}({\bf x},z) = \sum_i \mathcal{N}_{{\rm h},i}\, \mathcal{N}_{{\rm s},i}\, u_{\rm s}({\bf x} - {\bf x}_i|M_i,z) , \tag{30} $$

where $\mathcal{N}_{{\rm s},i}$ is a non-negative integer indicating the
number of satellite galaxies that reside in the halo whose center is in volume
element $i$, and $u_{\rm s}(r|M,z)$ describes the normalized radial
distribution of satellite galaxies in an average halo of mass $M$ at redshift
$z$*.

Using Eq. (28), the galaxy-galaxy power spectrum can be written as

$$ P_{\rm gg}(k,z) = f_{\rm c}^2(z)\, P_{\rm cc}(k,z) + 2 f_{\rm c}(z)\, f_{\rm s}(z)\, P_{\rm cs}(k,z) + f_{\rm s}^2(z)\, P_{\rm ss}(k,z) , \tag{31} $$

while the galaxy-matter cross power spectrum is given by

$$ P_{\rm gm}(k,z) = f_{\rm c}(z)\, P_{\rm cm}(k,z) + f_{\rm s}(z)\, P_{\rm sm}(k,z) . \tag{32} $$

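Using the same methodology as in §2.1 for the dark matter, we split each of
these five power-spectra into a 1-halo and a 2-halo term. The various 2-halo
terms are given by

$$ P_{\rm xy}^{\rm 2h}(k,z) = \int {\rm d}M_1\, \mathcal{H}_{\rm x}(k,M_1,z)\, n(M_1,z) \int {\rm d}M_2\, \mathcal{H}_{\rm y}(k,M_2,z)\, n(M_2,z)\, Q(k|M_1,M_2,z) , \tag{33} $$

where 'x' and 'y' are either 'c' (for central), 's' (for satellite), or 'm'
(for matter), $Q(k|M_1,M_2,z)$ is given by Eq. (20), and we have defined

$$ \mathcal{H}_{\rm m}(k,M,z) = \frac{M}{\bar{\rho}_{\rm m}}\, \tilde{u}_{\rm h}(k|M,z) , \tag{34} $$

$$ \mathcal{H}_{\rm c}(k,M,z) = \mathcal{H}_{\rm c}(M,z) = \frac{\langle N_{\rm c}|M \rangle}{\bar{n}_{\rm c}(z)} , \tag{35} $$

and

$$ \mathcal{H}_{\rm s}(k,M,z) = \frac{\langle N_{\rm s}|M \rangle}{\bar{n}_{\rm s}(z)}\, \tilde{u}_{\rm s}(k|M,z) . \tag{36} $$

Here $\langle N_{\rm c}|M \rangle$ and $\langle N_{\rm s}|M \rangle$ are the
average number of central and satellite galaxies in a halo of mass $M$, which
follow from Eq. (26) upon replacing $\Phi(L|M)$ by $\Phi_{\rm c}(L|M)$ and
$\Phi_{\rm s}(L|M)$, respectively.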

For the 1-halo terms, one obtains

$$ P_{\rm cc}^{\rm 1h}(k,z) = \frac{1}{\bar{n}_{\rm c}(z)} , \tag{37} $$



* Strictly speaking, by writing $n_{\rm s}({\bf x},z)$ in terms of
$u_{\rm s}(r|M,z)$ we have already taken an ensemble average over all possible
spatial realizations of the satellite galaxies in a halo of mass $M$ at
redshift $z$. Hence, the number density distribution of Eq. (30) does not
correspond to a single realization, as it should. However, since we are only
concerned here with power-spectra, which are in any case based on ensemble
averaging, Eq. (30) is adequate for what follows.



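$$ P_{\rm cs}^{\rm 1h}(k,z) = \int {\rm d}M\, \mathcal{H}_{\rm c}(M,z)\, \mathcal{H}_{\rm s}(k,M,z)\, n(M,z) , \tag{38} $$

and

$$ P_{\rm ss}^{\rm 1h}(k,z) = \mathcal{A}_{\rm P} \int {\rm d}M\, \mathcal{H}_{\rm s}^2(k,M,z)\, n(M,z) . \tag{39} $$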



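Here we have assumed that the occupation numbers of centrals and satellites
are independent, so that
$\langle N_{\rm c} N_{\rm s}|M \rangle = \langle N_{\rm c}|M \rangle\, \langle N_{\rm s}|M \rangle$,
and we have introduced the parameter

$$ \mathcal{A}_{\rm P} \equiv \frac{\langle N_{\rm s}(N_{\rm s} - 1)|M \rangle}{\langle N_{\rm s}|M \rangle^2} . \tag{40} $$

If the occupation number of satellites follows a Poisson distribution, i.e.,

$$ P(N_{\rm s}|M) = \frac{\lambda^{N_{\rm s}}\, e^{-\lambda}}{N_{\rm s}!} , \tag{41} $$

with $\lambda = \langle N_{\rm s}|M \rangle$, then $\mathcal{A}_{\rm P} = 1$,
while values of $\mathcal{A}_{\rm P}$ larger (smaller) than unity indicate
super- (sub-) Poisson statistics.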
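The statement that Poisson statistics give $\mathcal{A}_{\rm P} = 1$ can be checked directly from Eqs. (40) and (41) by summing over the occupation distribution. In the sketch below the binomial distribution is included only as a convenient example of sub-Poisson statistics; it is an assumption of this illustration, not a model used in the text.

```python
import math

def poisson_pmf(n, lam):
    """Poisson probability of Eq. (41), evaluated in log space for stability."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))

def binomial_pmf(n, trials, prob):
    """Binomial probability, used here as a simple sub-Poisson example."""
    if n > trials:
        return 0.0
    return math.comb(trials, n) * prob ** n * (1.0 - prob) ** (trials - n)

def poisson_parameter(pmf, n_max=200):
    """A_P = <N (N - 1)> / <N>^2 (Eq. 40) for an occupation distribution pmf(n)."""
    mean = sum(n * pmf(n) for n in range(n_max + 1))
    pairs = sum(n * (n - 1) * pmf(n) for n in range(n_max + 1))
    return pairs / mean ** 2
```

For a binomial distribution with $N$ trials the ratio is $(N-1)/N$, i.e. below unity, while the Poisson case returns unity for any mean occupation.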

2.3 The Projected Correlation Function and 
Excess Surface Density 

Once $P_{\rm gg}(k,z)$ and $P_{\rm gm}(k,z)$ have been determined, it is
fairly straightforward to compute the projected galaxy-galaxy correlation
function, $w_{\rm p}(r_{\rm p},z)$, and the excess surface density (ESD)
profile, $\Delta\Sigma(R,z)$. We start by Fourier transforming the
power-spectra to obtain the two-point correlation functions:

$$ \xi_{\rm xy}(r,z) = \frac{1}{(2\pi)^3} \int P_{\rm xy}({\bf k},z)\, e^{+i{\bf k}\cdot{\bf x}}\, {\rm d}^3{\bf k} = \frac{1}{2\pi^2} \int_0^{\infty} P_{\rm xy}(k,z)\, \frac{\sin kr}{kr}\, k^2\, {\rm d}k , \tag{42} $$

where 'x' and 'y' are as defined above.
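A minimal sketch of the radial transform in Eq. (42), using a plain trapezoid rule, is given below. The Gaussian power spectrum in the cross-check is an assumption chosen only because its transform is known in closed form; for realistic, slowly decaying power spectra the oscillatory integrand generally calls for a dedicated scheme, so the settings here should not be taken as adequate for production use.

```python
import math

def xi_from_power(power, r, k_max=50.0, n_steps=20000):
    """xi(r) = 1 / (2 pi^2) * int_0^inf P(k) sin(kr)/(kr) k^2 dk, as in Eq. (42).

    power : callable returning P(k)
    r     : separation (must be positive)
    k_max : upper cut-off of the k integral (an assumption of this sketch)
    """
    dk = k_max / n_steps
    total = 0.0
    for i in range(1, n_steps + 1):  # the integrand vanishes at k = 0
        k = i * dk
        weight = 0.5 if i == n_steps else 1.0
        total += weight * power(k) * math.sin(k * r) / (k * r) * k * k
    return total * dk / (2.0 * math.pi ** 2)

def gaussian_xi(r, a=1.0):
    """Closed-form transform of the toy spectrum P(k) = exp(-(a k)^2)."""
    return (4.0 * math.pi * a * a) ** -1.5 * math.exp(-r * r / (4.0 * a * a))
```

Comparing `xi_from_power(lambda k: math.exp(-k * k), r)` with `gaussian_xi(r)` gives a quick sanity check of the normalization $1/(2\pi^2)$.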

As discussed above, the excess surface density profile

$$ \Delta\Sigma(R,z) = \bar{\Sigma}(<R,z) - \Sigma(R,z) , \tag{43} $$

where $\bar{\Sigma}(<R,z)$ is given by Eq. (4). The projected surface density,
$\Sigma(R,z)$, is related to the galaxy-matter cross correlation,
$\xi_{\rm gm}(r,z)$, according to

$$ \Sigma(R,z) = \bar{\rho}_{\rm m} \int \left[1 + \xi_{\rm gm}(r,z)\right] {\rm d}\omega , \tag{44} $$

where the integral is along the line of sight with $\omega$ the comoving
distance from the observer. The three-dimensional comoving distance $r$ is
related to $\omega$ through
$r^2 = \omega_{\rm L}^2 + \omega^2 - 2\, \omega_{\rm L}\, \omega \cos\theta$.
Here $\omega_{\rm L}$ is the comoving distance to the lens, and $\theta$ is
the angular separation between lens and source (see Fig. 1 in Cacciato et
al. 2009). Since $\xi_{\rm gm}(r,z)$ goes to zero in the limit
$r \rightarrow \infty$, and since in practice $\theta$ is small, we can
approximate $\Sigma(R,z)$ using Eq. (3), which is the expression we adopt
throughout.
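The chain from $\xi_{\rm gm}$ to the ESD, Eqs. (3), (4) and (43), can be sketched numerically as follows. Only the clustering part of Eq. (3) is integrated: the constant term in $[1 + \xi_{\rm gm}]$ adds the same uniform sheet to $\Sigma(R)$ and to its mean inside $R$, so it cancels in $\Delta\Sigma$. The power-law $\xi_{\rm gm}$ used in the check, the grid settings and the function names are assumptions of this example.

```python
import math

def sigma_excess(xi_gm, radius, rho_mean, t_max=25.0, n_steps=1000):
    """Clustering part of Eq. (3): 2 rho_mean int_R^inf xi_gm(r) r dr / sqrt(r^2 - R^2).

    Uses r = R cosh(t), for which r dr / sqrt(r^2 - R^2) = R cosh(t) dt.
    """
    dt = t_max / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        r = radius * math.cosh(i * dt)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * xi_gm(r) * r
    return 2.0 * rho_mean * total * dt

def mean_sigma_inside(xi_gm, radius, rho_mean, n_steps=400, ratio=1e-6):
    """Eq. (4): (2 / R^2) int_0^R Sigma(R') R' dR', on a logarithmic grid in R'.

    The grid starts at ratio * R instead of zero (an approximation).
    """
    a, b = math.log(ratio * radius), math.log(radius)
    h = (b - a) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        rp = math.exp(a + i * h)
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * sigma_excess(xi_gm, rp, rho_mean) * rp * rp  # R' dR' = R'^2 dlnR'
    return 2.0 * total * h / radius ** 2

def excess_surface_density(xi_gm, radius, rho_mean):
    """Delta Sigma(R) = mean Sigma inside R minus Sigma at R, Eq. (43)."""
    return mean_sigma_inside(xi_gm, radius, rho_mean) - sigma_excess(xi_gm, radius, rho_mean)
```

For a power law $\xi_{\rm gm} \propto r^{-\gamma}$ with $1 < \gamma < 3$ one expects $\Delta\Sigma/\Sigma = (\gamma - 1)/(3 - \gamma)$ for the clustering part, which provides a simple check of the implementation.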

The projected galaxy-galaxy correlation function is defined as

$$ w_{\rm p}(r_{\rm p},z) = 2 \int_0^{r_{\rm max}} \xi_{\rm gg}(r_{\rm p},r_\pi,z)\, {\rm d}r_\pi . \tag{45} $$

Here $r_{\rm p}$ is the projected separation between two galaxies, $r_\pi$ is
the redshift-space separation along the line-of-sight, and
$\xi_{\rm gg}(r_{\rm p},r_\pi,z)$ is the measured two-dimensional correlation
function, which is anisotropic due to the presence of peculiar velocities. In
the limit $r_{\rm max} \rightarrow \infty$, the projected correlation function
(45) is completely independent of these peculiar velocities, simply because
they have been integrated out. In that case, $w_{\rm p}(r_{\rm p})$ can be
written as a simple Abel transform of the real-space correlation function:

$$ w_{\rm p}(r_{\rm p},z) = 2 \int_{r_{\rm p}}^{\infty} \xi_{\rm gg}(r,z)\, \frac{r\, {\rm d}r}{\sqrt{r^2 - r_{\rm p}^2}} \tag{46} $$

(Davis & Peebles 1983). However, since real data sets are always limited in
extent, in practice the projected correlation function
$w_{\rm p}(r_{\rm p},z)$ is always obtained by integrating
$\xi_{\rm gg}(r_{\rm p},r_\pi,z)$ out to some finite $r_{\rm max}$ rather than
to infinity. For example, Zehavi et al. (2011), whose data we use in Paper
III, adopt $r_{\rm max} = 40\,h^{-1}$ Mpc or $60\,h^{-1}$ Mpc, depending on
the luminosity sample used. This finite integration range is often ignored in
the modeling (e.g., Magliocchetti & Porciani 2003; Collister & Lahav 2005;
Wake et al. 2008a,b) or is 'accounted' for by computing the model prediction
for $w_{\rm p}(r_{\rm p},z)$ using Eq. (46), but integrating from $r_{\rm p}$
out to $r_{\rm out} = \sqrt{r_{\rm p}^2 + r_{\rm max}^2}$, where $r_{\rm max}$
is the same value as used for the data (e.g., Zehavi et al. 2004, 2005, 2011;
Abazajian et al. 2005; Tinker et al. 2005; Zheng et al. 2007, 2009; Yoo et
al. 2009). However, as we demonstrate in §4.5 below, this introduces errors
that can easily exceed 40 percent on the largest scales probed by the data
($\sim 20\,h^{-1}$ Mpc; see also Padmanabhan, White & Eisenstein 2007; Norberg
et al. 2009; Baldauf et al. 2010). This is due to the fact that the peculiar
velocities on scales $r > r_{\rm max}$ cannot be ignored. In order to take
these residual redshift space distortions into account, we make the assumption
that the large scale peculiar velocities are completely dominated by linear
velocities (i.e., those that derive from linear perturbation theory), and that
the non-linear motions that give rise to the Finger-of-God effect have been
integrated out. In that case we can correct Eq. (46) for the fact that the
projected correlation function has been obtained using Eq. (45) with a finite
$r_{\rm max}$ as follows:



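$$w_p(r_p,z) = 2\, f_{\rm corr}(r_p,z) \int_{r_p}^{r_{\rm out}} \xi_{\rm gg}(r,z)\, \frac{r\, dr}{\sqrt{r^2 - r_p^2}} \,, \tag{47}$$

where $f_{\rm corr}(r_p,z)$ is the correction factor given by

$$f_{\rm corr}(r_p,z) = \frac{\int_0^{r_{\rm max}} \xi^{\rm lin}_{\rm gg}(r_p,r_\pi,z)\, dr_\pi}{\int_{r_p}^{r_{\rm out}} \xi^{\rm lin}_{\rm gg}(r,z)\, \frac{r\, dr}{\sqrt{r^2 - r_p^2}}} \,. \tag{48}$$

Here $\xi^{\rm lin}_{\rm gg}(r,z)$ and $\xi^{\rm lin}_{\rm gg}(r_p,r_\pi,z)$ are the linear two-point correlation functions of galaxies at redshift $z$ in real space and redshift space, respectively. For the former we may write

$$\xi^{\rm lin}_{\rm gg}(r,z) = \bar{b}^2(z)\, \xi^{\rm lin}_{\rm mm}(r,z) \,, \tag{49}$$

with $\xi^{\rm lin}_{\rm mm}(r,z)$ the two-point correlation function of the initial matter field, linearly extrapolated to redshift $z$, and

$$\bar{b}(z) = \frac{1}{\bar{n}_{\rm g}(z)} \int_0^{\infty} \langle N_{\rm g}|M\rangle\, b_{\rm h}(M,z)\, n(M,z)\, dM \,, \tag{50}$$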



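is the mean bias of the galaxies in consideration. For the linear galaxy correlation function in redshift space we can write

$$\xi^{\rm lin}_{\rm gg}(r_p,r_\pi,z) = \sum_{l=0}^{2} \xi_{2l}(s,z)\, \mathcal{P}_{2l}(\mu) \tag{51}$$

(e.g., Kaiser 1987; Hamilton 1992). Here $s = \sqrt{r_p^2 + r_\pi^2}$ is the separation between the galaxies in redshift space, $\mu = r_\pi/s$ is the cosine of the line-of-sight angle, $\mathcal{P}_l(x)$ is the $l$th Legendre polynomial, and $\xi_0$, $\xi_2$, and $\xi_4$ are given by

$$\xi_0(r,z) = \left(1 + \frac{2}{3}\beta + \frac{1}{5}\beta^2\right) \xi^{\rm lin}_{\rm gg}(r,z) \,, \tag{52}$$

$$\xi_2(r,z) = \left(\frac{4}{3}\beta + \frac{4}{7}\beta^2\right) \left[\xi^{\rm lin}_{\rm gg}(r,z) - 3 J_3(r,z)\right] \,, \tag{53}$$

$$\xi_4(r,z) = \frac{8}{35}\beta^2 \left[\xi^{\rm lin}_{\rm gg}(r,z) + \frac{15}{2} J_3(r,z) - \frac{35}{2} J_5(r,z)\right] \,, \tag{54}$$

where

$$J_n(r,z) = \frac{1}{r^n} \int_0^r \xi^{\rm lin}_{\rm gg}(y,z)\, y^{n-1}\, dy \,, \tag{55}$$

and

$$\beta = \beta(z) = \frac{1}{\bar{b}(z)} \left(\frac{d\ln D}{d\ln a}\right)_z \simeq \frac{\Omega_{\rm m}^{0.6}(z)}{\bar{b}(z)} \,, \tag{56}$$

with $a = 1/(1+z)$ the scale factor and $D(z)$ the linear growth factor.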

As we demonstrate in §4.5, although this correction is fairly accurate on large scales ($\gtrsim 3\,h^{-1}$ Mpc), on smaller scales it introduces an error of a few percent (see also Baldauf et al. 2010). Detailed tests with mocks indicate that this problem can be avoided by simply replacing the linear galaxy-galaxy correlation function in the Kaiser formalism with its non-linear analog; i.e., by replacing in Eq. (48) and Eqs. (52)-(55) each occurrence of $\xi^{\rm lin}_{\rm gg}(r,z)$ with $\xi_{\rm gg}(r,z)$ computed from Eq. (42) using the model outlined in §2.2. This is the method we will use throughout whenever we compute $w_p(r_p,z)$ for comparison with data, always using the same $r_{\rm max}$ as used for the data (see Paper III) and with $\bar{b}(z)$ computed from our CLF model using Eq. (50). Note that with this modified version of the Kaiser formalism, the denominator of $f_{\rm corr}$ in Eq. (48) is exactly equal to the integral in Eq. (47). Hence, there is no need to compute the correction factor; rather, $w_p(r_p)$ can simply be obtained directly using Eq. (45) with $\xi_{\rm gg}(r_p,r_\pi,z)$ given by Eqs. (51)-(55), but with $\xi^{\rm lin}_{\rm gg}(r,z)$ replaced by $\xi_{\rm gg}(r,z)$ (see §4.5 for details).
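The effect of a finite $r_{\rm max}$ can be made concrete with a small sketch that integrates the Kaiser-distorted correlation function along the line of sight, as in Eq. (45). It assumes a toy power-law correlation function $\xi(r) = (r/r_0)^{-\gamma}$ with $\gamma < 3$, for which the integrals $J_n$ of Eq. (55) reduce to $\xi/(n-\gamma)$; the power law, the default values of `r0`, `gamma` and `beta`, and the function names are illustrative assumptions, not the model used in this paper.

```python
import math

def kaiser_factor(mu, beta, gamma):
    """xi(r_p, r_pi) / xi(s) from Eqs. (51)-(55) for xi(r) = (r/r0)^-gamma.

    For this power law J_n = xi / (n - gamma) (requires gamma < 3), so every
    multipole is a constant multiple of xi(s).
    """
    j3 = 1.0 / (3.0 - gamma)
    j5 = 1.0 / (5.0 - gamma)
    a0 = 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0
    a2 = (4.0 * beta / 3.0 + 4.0 * beta ** 2 / 7.0) * (1.0 - 3.0 * j3)
    a4 = (8.0 * beta ** 2 / 35.0) * (1.0 + 7.5 * j3 - 17.5 * j5)
    m2 = mu * mu
    p2 = 0.5 * (3.0 * m2 - 1.0)                      # Legendre P_2
    p4 = (35.0 * m2 * m2 - 30.0 * m2 + 3.0) / 8.0    # Legendre P_4
    return a0 + a2 * p2 + a4 * p4

def projected_xi(r_p, r_max, beta, r0=5.0, gamma=1.8, n_steps=2000):
    """Eq. (45): w_p = 2 int_0^rmax xi(r_p, r_pi) d r_pi (Simpson rule).

    Uses r_pi = r_p sinh(t), so that s = r_p cosh(t) and mu = tanh(t).
    """
    t_max = math.asinh(r_max / r_p)
    h = t_max / n_steps

    def integrand(t):
        s = r_p * math.cosh(t)
        xi = (s / r0) ** (-gamma)
        return xi * kaiser_factor(math.tanh(t), beta, gamma) * r_p * math.cosh(t)

    total = integrand(0.0) + integrand(t_max)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return 2.0 * total * h / 3.0

def projected_xi_abel(r_p, r0=5.0, gamma=1.8):
    """Closed-form Eq. (46), integrated to infinity, for the same power law."""
    return (r_p * (r_p / r0) ** (-gamma) * math.sqrt(math.pi)
            * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma))
```

With `beta=0` the routine is equivalent to the Abel integral of Eq. (46) truncated at $r_{\rm out} = \sqrt{r_p^2 + r_{\rm max}^2}$. Comparing `projected_xi(20.0, 40.0, 0.5)` with `projected_xi(20.0, 40.0, 0.0)` gives a feel for the size of the residual redshift space distortions in this toy setting, while for very large `r_max` the two converge, as expected when the peculiar velocities are integrated out.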



3 MODEL INGREDIENTS 

The model described in the previous section requires a number of ingredients, namely the halo mass function, $n(M,z)$, the halo bias function, $b_{\rm h}(M,z)$, the radial bias function, $\zeta(r,z)$, the linear and non-linear matter power spectra, $P^{\rm lin}_{\rm mm}(k,z)$ and $P_{\rm mm}(k,z)$, respectively, the (normalized) halo density profile, $u_{\rm h}(r|M)$, the (normalized) radial number density distribution of satellite galaxies, $u_{\rm s}(r|M)$, and the halo occupation statistics $\langle N_{\rm c}|M\rangle$ and $\langle N_{\rm s}|M\rangle$. We now discuss these ingredients in turn.



3.1 Matter Power Spectra 

In our fiducial model, which includes a treatment of halo exclusion, we require both the linear and the non-linear two-point correlation functions of the matter, $\xi^{\rm lin}_{\rm mm}(r,z)$ and $\xi_{\rm mm}(r,z)$, which are the Fourier transforms of the linear and non-linear power spectra, $P^{\rm lin}_{\rm mm}(k,z)$ and $P_{\rm mm}(k,z)$, respectively.

Throughout we compute $P_{\rm mm}(k,z)$ using the fitting formula of Smith et al. (2003)$^{8}$, which is modeled on the basis of the linear matter power spectrum,

$$P^{\rm lin}_{\rm mm}(k,z) \propto D^2(z)\, T^2(k)\, k^{n_{\rm s}} \,. \tag{57}$$

Here $n_{\rm s}$ is the spectral index of the initial power spectrum, $T(k)$ is the linear transfer function, and $D(z)$ is the linear growth factor at redshift $z$, normalized to unity at $z = 0$. We adopt the linear transfer function of Eisenstein & Hu (1998), which properly accounts for the baryons, neglecting any contribution from neutrinos and assuming a CMB temperature of 2.725 K (Mather et al. 1999). The power spectrum is normalized such that the mass variance

$$\sigma^2(M) = \frac{1}{2\pi^2} \int_0^{\infty} P^{\rm lin}_{\rm mm}(k,0)\, W^2(kR)\, k^2\, dk \,, \tag{58}$$

is equal to $\sigma_8^2$ for $R = 8\,h^{-1}$ Mpc. Here

$$W(kR) = \frac{3\,(\sin kR - kR \cos kR)}{(kR)^3} \tag{59}$$

is the Fourier transform of the spatial top-hat filter, and $M$ is related to $R$ according to $M = 4\pi \bar{\rho}_{\rm m} R^3/3$.
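The normalization of Eqs. (58) and (59) can be sketched as follows. The power spectrum used here is a hypothetical stand-in with roughly the right asymptotic slopes; it is not the Eisenstein & Hu (1998) transfer function adopted in the text, and the function names and integration limits are likewise assumptions made for this example.

```python
import math

def tophat_window(x):
    """Eq. (59): Fourier transform of a spherical top-hat, with x = kR."""
    if abs(x) < 1e-3:
        # Leading terms of the series; avoids cancellation near x = 0.
        return 1.0 - x * x / 10.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x ** 3

def toy_power(k, n_s=1.0, k0=0.02):
    """Hypothetical stand-in for T^2(k) k^n_s (NOT the Eisenstein & Hu form)."""
    return k ** n_s / (1.0 + (k / k0) ** 2) ** 2

def sigma(radius, amplitude=1.0, k_min=1e-5, k_max=1e2, n_steps=20000):
    """Eq. (58): sigma^2 = 1/(2 pi^2) int P W^2(kR) k^2 dk, done in ln k."""
    ln_lo, ln_hi = math.log(k_min), math.log(k_max)
    h = (ln_hi - ln_lo) / n_steps

    def integrand(ln_k):
        k = math.exp(ln_k)
        return toy_power(k) * tophat_window(k * radius) ** 2 * k ** 3

    total = integrand(ln_lo) + integrand(ln_hi)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(ln_lo + i * h)
    return math.sqrt(amplitude * total * h / 3.0 / (2.0 * math.pi ** 2))

def amplitude_for_sigma8(sigma8):
    """Amplitude A such that sigma(R = 8 Mpc/h) equals the requested sigma8."""
    return (sigma8 / sigma(8.0)) ** 2

def mass_from_radius(radius, mean_density):
    """M = 4 pi rho_m R^3 / 3, the mass enclosed by the top-hat filter."""
    return 4.0 * math.pi * mean_density * radius ** 3 / 3.0
```

For example, `sigma(8.0, amplitude_for_sigma8(0.9))` returns 0.9 by construction, after which `sigma(R, amplitude)` can be evaluated at the radius corresponding to any halo mass through `mass_from_radius`.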



3.2 Halo Mass Function 

For the halo mass function, $n(M,z)$, which specifies the comoving abundance of dark matter haloes of mass $M$ at redshift $z$, we use the results of Tinker et al. (2008, 2010), who have shown that the halo mass function is accurately described by

$$n(M,z) = \frac{\bar{\rho}_{\rm m}}{M^2}\, \nu f(\nu)\, \frac{d\ln\nu}{d\ln M} \,, \tag{60}$$

where $\nu = \delta_{\rm sc}(z)/\sigma(M)$, with $\delta_{\rm sc}(z)$ the critical overdensity required for spherical collapse at $z$, and

$$f(\nu) = \eta_0 \left[1 + (\eta_1 \nu)^{-2\eta_2}\right] \nu^{2\eta_3}\, e^{-\eta_4 \nu^2/2} \,. \tag{61}$$

For our definition of halo mass (see §2.1), Tinker et al. (2010) find that $\eta_1 = 0.589\,(1+z)^{0.20}$, $\eta_2 = -0.729\,(1+z)^{-0.08}$, $\eta_3 = -0.243\,(1+z)^{0.27}$, and $\eta_4 = 0.864\,(1+z)^{-0.01}$, while $\eta_0 = \eta_0(z)$ is set by the normalization condition

$$\int b_{\rm h}(\nu)\, f(\nu)\, d\nu = 1 \,, \tag{62}$$

with $b_{\rm h}(\nu)$ the halo bias function of Tinker et al. (2010), specified in §3.3 below. This normalization expresses that the distribution of matter is, by definition, unbiased with respect to itself.

Throughout we adopt

$$\delta_{\rm sc}(z) = 0.15\,(12\pi)^{2/3}\, \frac{[\Omega_{\rm m}(z)]^{0.0055}}{D(z)} \,, \tag{63}$$

which is a good numerical approximation to the critical threshold for spherical collapse (Navarro, Frenk & White 1997).



8 We use the small modification suggested on John Peacock's 
website http://www.roe.ac.uk/~jap/halocs/, although it has no 
significant impact on any of our results. 



3.3 Halo Bias Function 

For the halo bias function we adopt the fitting function of Tinker et al. (2010), which for our definition of halo mass, can be written as

$$b_{\rm h}(M,z) = b_{\rm h}(\nu) = 1 - \frac{\nu^{0.1325}}{\nu^{0.1325} + 1.0716} + 0.1830\,\nu^{1.5} + 0.2652\,\nu^{2.4} \,, \tag{64}$$

where, as before, $\nu = \delta_{\rm sc}(z)/\sigma(M)$.

Although we believe the halo mass function and halo bias function obtained by Tinker et al. (2008, 2010) to be the most accurate to date, it is important to realize that they still can carry uncertainties that can potentially impact cosmological results. It is unclear if such uncertainties affect just the mass function normalization and not its shape. We will carry out a proper investigation of this issue in future work. Throughout this paper, however, we restrict ourselves to the $n(M,z)$ and $b_{\rm h}(M,z)$ specified above.
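The normalization condition of Eq. (62) ties the mass function amplitude $\eta_0$ to the bias function. A minimal sketch of how $\eta_0(z)$ could be obtained numerically is given below; the integration in $\ln\nu$, the integration limits, and the function names are choices made for this illustration, and the coefficients are those quoted in Eqs. (61) and (64).

```python
import math

def halo_bias(nu):
    """Eq. (64): Tinker et al. (2010) bias for the halo definition used here."""
    nu_a = nu ** 0.1325
    return (1.0 - nu_a / (nu_a + 1.0716)
            + 0.1830 * nu ** 1.5 + 0.2652 * nu ** 2.4)

def multiplicity_shape(nu, z=0.0):
    """Eq. (61) without the prefactor eta_0."""
    eta1 = 0.589 * (1.0 + z) ** 0.20
    eta2 = -0.729 * (1.0 + z) ** -0.08
    eta3 = -0.243 * (1.0 + z) ** 0.27
    eta4 = 0.864 * (1.0 + z) ** -0.01
    return ((1.0 + (eta1 * nu) ** (-2.0 * eta2)) * nu ** (2.0 * eta3)
            * math.exp(-0.5 * eta4 * nu * nu))

def eta0(z=0.0, ln_lo=-40.0, ln_hi=math.log(12.0), n_steps=4000):
    """Solve Eq. (62), int b_h(nu) f(nu) d nu = 1, for eta_0 (Simpson rule).

    The integral is done in ln(nu) because the integrand has an integrable
    power-law divergence at nu -> 0; the limits are assumptions.
    """
    h = (ln_hi - ln_lo) / n_steps

    def integrand(ln_nu):
        nu = math.exp(ln_nu)
        # d nu = nu d ln nu
        return halo_bias(nu) * multiplicity_shape(nu, z) * nu

    total = integrand(ln_lo) + integrand(ln_hi)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(ln_lo + i * h)
    return 1.0 / (total * h / 3.0)
```

Calling `eta0(z)` once per redshift and multiplying `multiplicity_shape(nu, z)` by the result gives an $f(\nu)$ that satisfies Eq. (62) to within the accuracy of the quadrature.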



3.4 Radial Bias Function 

An important ingredient of the halo model is the radial bias function, $\zeta(r,z)$, which accounts for the fact that Eq. (10) becomes inaccurate in the quasi-linear regime, by making halo bias scale dependent, i.e., it effectively describes the impact of the non-zero higher-order bias factors in Eq. (9).

Ideally, the radial dependence of the halo bias is to be computed from first principles using, for example, (renormalized) perturbation theory (e.g., Crocce & Scoccimarro 2006; McDonald 2006, 2007; Smith, Scoccimarro & Sheth 2007; Elia et al. 2011). However, it remains to be seen whether these techniques can yield reliable results in the quasi-linear regime of the 1-halo to 2-halo transition region, which will probably require an impracticably large number of orders or loops in the perturbation series. In the absence of such an analytical solution we have to resort to empirical fitting functions calibrated against numerical simulations. Throughout, we adopt the fitting function of Tinker et al. (2005), given by

$$\zeta_0(r,z) = \frac{\left[1 + 1.17\,\xi_{\rm mm}(r,z)\right]^{1.49}}{\left[1 + 0.69\,\xi_{\rm mm}(r,z)\right]^{2.09}} \,. \tag{65}$$



The subscript 0 indicates that this fitting function was calibrated using $N$-body simulations in which the haloes were identified using the friends-of-friends (FOF) percolation algorithm (e.g., Davis et al. 1985), with a linking length of 0.2 times the mean inter-particle separation. However, the halo mass function and halo bias function used here are based on the spherical overdensity algorithm. As already pointed out in Appendix A of Tinker et al. (2012), because of these different halo definitions, the fitting function (65) is likely to be inadequate on small scales, which we indeed find to be the case (see §4.2 below). After some trial and error, while assuring an easy numerical implementation, we decided to adopt the following, modified, radial bias function

$$\zeta(r,z) = \begin{cases} \zeta_0(r,z) & \mbox{if } r \geq r_\psi \\ \zeta_0(r_\psi,z) & \mbox{if } r < r_\psi \end{cases} \tag{66}$$

where the characteristic radius, $r_\psi$, is defined by

$$\log\left[\zeta_0(r_\psi,z)\, \xi_{\rm mm}(r_\psi,z)\right] = \psi \,, \tag{67}$$

where $\psi$ is a free parameter to be calibrated against numerical simulations (see §4.2). Note that if Eq. (67) has no solution, e.g., when $\psi \rightarrow +\infty$, we set $r_\psi = 0$, which corresponds to simply using the fitting function (65) without modification.
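The modified radial bias of Eqs. (65)-(67) is straightforward to evaluate once $r_\psi$ is known. The sketch below finds $r_\psi$ by bisection; the power-law matter correlation function, the search bracket, and the function names are assumptions made for illustration, and the bisection relies on $\zeta_0\,\xi_{\rm mm}$ decreasing monotonically with $r$, which holds for a monotonically decreasing $\xi_{\rm mm}$.

```python
import math

def xi_mm_toy(r, r0=5.0, gamma=1.8):
    """Hypothetical power-law matter correlation function (illustration only)."""
    return (r / r0) ** (-gamma)

def zeta0(r, xi_mm=xi_mm_toy):
    """Eq. (65): radial bias fit of Tinker et al. (2005)."""
    xi = xi_mm(r)
    return (1.0 + 1.17 * xi) ** 1.49 / (1.0 + 0.69 * xi) ** 2.09

def characteristic_radius(psi, xi_mm=xi_mm_toy, r_lo=1e-3, r_hi=1e2):
    """Solve Eq. (67), log10[zeta0(r) xi_mm(r)] = psi, by bisection in ln r.

    Returns 0.0 if the left-hand side never reaches psi down to r_lo, which
    recovers the unmodified fitting function of Eq. (65).
    """
    def g(r):
        return math.log10(zeta0(r, xi_mm) * xi_mm(r)) - psi

    if g(r_lo) < 0.0:
        return 0.0
    if g(r_hi) > 0.0:
        raise ValueError("root lies beyond r_hi; widen the bracket")
    lo, hi = math.log(r_lo), math.log(r_hi)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(math.exp(mid)) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))

def zeta(r, psi, xi_mm=xi_mm_toy):
    """Eq. (66): hold zeta0 fixed at its value at r_psi for r < r_psi."""
    r_psi = characteristic_radius(psi, xi_mm)
    return zeta0(max(r, r_psi), xi_mm)
```

In a production setting one would compute `characteristic_radius` once per redshift and cache it, rather than re-solving Eq. (67) on every call to `zeta` as this sketch does.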



3.5 Density Profile of Dark Matter Haloes 

We assume that dark matter haloes are spheres whose normalized density distribution is given by the NFW profile

$$u_{\rm h}(r|M) = \frac{\bar{\delta}_{200}\, \bar{\rho}_{\rm m}}{M}\, \frac{1}{(r/r_*)(1 + r/r_*)^2} \tag{68}$$

(Navarro, Frenk & White 1997), where $r_*$ is a characteristic radius and $\bar{\delta}_{200}$ is a dimensionless amplitude which can be expressed in terms of the halo concentration parameter $c = r_{200}/r_*$ as

$$\bar{\delta}_{200} = \frac{200}{3}\, \frac{c^3}{\ln(1+c) - c/(1+c)} \,. \tag{69}$$

Numerical simulations show that $c$ is correlated with halo mass. Throughout our work we use the concentration-mass relation of Maccio et al. (2007), properly converted to our definition of halo mass.

The Fourier transform of the NFW profile, which features prominently in our model, is given by

$$\tilde{u}_{\rm h}(k|M,z) = \frac{3\,\bar{\delta}_{200}}{200\, c^3} \left( \cos\mu \left[{\rm Ci}(\mu + \mu c) - {\rm Ci}(\mu)\right] + \sin\mu \left[{\rm Si}(\mu + \mu c) - {\rm Si}(\mu)\right] - \frac{\sin \mu c}{\mu + \mu c} \right) \,, \tag{70}$$

where $\mu = k r_*$, and ${\rm Si}(x)$ and ${\rm Ci}(x)$ are the standard sine and cosine integrals, respectively.

Note that this model for dark matter haloes is highly oversimplified. In reality, haloes are triaxial, rather than spherical, have scatter in the concentration-mass relation, have substructure, and may have a density profile that differs significantly from an NFW profile due to the action of baryons. A detailed discussion regarding the impact of these oversimplifications on our results is presented in §5.
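As a cross-check on the analytical transform, the Fourier transform of the truncated NFW profile can also be obtained by direct quadrature. The sketch below does this in the dimensionless variables $x = r/r_*$ and $\mu = k r_*$; the function names and the use of a Simpson rule are assumptions of this example, which deliberately avoids the sine and cosine integrals so that it runs with the standard library alone.

```python
import math

def nfw_mass_factor(c):
    """m(c) = ln(1 + c) - c / (1 + c); note 3 delta_200 / (200 c^3) = 1 / m(c)."""
    return math.log(1.0 + c) - c / (1.0 + c)

def nfw_fourier(mu, c, n_steps=2000):
    """Numerical transform of the normalised NFW profile truncated at r_200.

    In units x = r / r_* the transform is
    (1 / m(c)) * int_0^c x / (1 + x)^2 * sin(mu x) / (mu x) dx.
    """
    h = c / n_steps

    def integrand(x):
        y = mu * x
        sinc = math.sin(y) / y if abs(y) > 1e-8 else 1.0
        return x / (1.0 + x) ** 2 * sinc

    total = integrand(0.0) + integrand(c)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0 / nfw_mass_factor(c)

def nfw_mean_square_radius(c):
    """<x^2> of the truncated profile; sets the expansion 1 - mu^2 <x^2> / 6."""
    y = 1.0 + c
    integral = 0.5 * (y * y - 1.0) - 3.0 * c + 3.0 * math.log(y) + 1.0 / y - 1.0
    return integral / nfw_mass_factor(c)
```

The transform tends to unity for $\mu \rightarrow 0$, reflecting the normalization of $u_{\rm h}(r|M)$, and falls off once $k$ exceeds the inverse scale radius.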



3.6 Radial Number Density Distribution of 
Satellites 

Throughout, we assume that satellite galaxies follow a radial number density distribution given by a generalized NFW profile (e.g., van den Bosch et al. 2004):

$$u_{\rm s}(r|M) \propto \left(\frac{r}{\mathcal{R}\, r_*}\right)^{-\gamma} \left(1 + \frac{r}{\mathcal{R}\, r_*}\right)^{\gamma - 3} \,, \tag{71}$$

so that $u_{\rm s} \propto r^{-\gamma}$ and $u_{\rm s} \propto r^{-3}$ at small and large radii, respectively. Here $\mathcal{R}$ and $\gamma$ are two free parameters, while the scale radius $r_*$ is the same as that for the dark matter mass profile (Eq. [68]). For our fiducial model, we adopt $\mathcal{R} = \gamma = 1$ so that $u_{\rm s}(r|M) = u_{\rm h}(r|M)$, i.e. satellites are unbiased with respect to the dark matter. For consistency with our definition of halo mass, we only adopt profile (71) out to $r_{200}$ (i.e., all satellites have halo-centric radii $r \leq r_{200}$).

Observations of the number density distribution of satellite galaxies in clusters and groups seem to suggest that $u_{\rm s}(r|M)$ is in reasonable agreement with an NFW profile, for which $\gamma = 1$ (e.g., Beers & Tonry 1986; Carlberg, Yee & Ellingson 1997a; van der Marel et al. 2000; Lin, Mohr & Stanford 2004; van den Bosch et al. 2005a). However, several studies have suggested that the satellite galaxies are less centrally concentrated than the dark matter, corresponding to $\mathcal{R} > 1$ (e.g., Yang et al. 2005; Chen 2008; More et al. 2009a). On the other hand, in the case of very massive galaxies, in particular luminous red galaxies, there are strong indications that they follow a radial profile that is more centrally concentrated (i.e., $\mathcal{R} < 1$) than the dark matter (e.g., Masjedi et al. 2006; Watson et al. 2010, 2012; Tal, Wake & van Dokkum 2012). In Paper III we therefore examine how the results depend on changes in $\mathcal{R}$.



3.7 Halo Occupation Statistics 

As specified in §2.2, the halo occupation statistics $\langle N_{\rm c}|M\rangle$ and $\langle N_{\rm s}|M\rangle$, required to describe the galaxy auto-correlation function and the galaxy-matter cross-correlation function, are obtained from the CLF,

$$\Phi(L|M) = \Phi_{\rm c}(L|M) + \Phi_{\rm s}(L|M) \,. \tag{72}$$

We use the CLF model presented in Cacciato et al. (2009), which is motivated by the CLFs obtained by Yang, Mo & van den Bosch (2008) from a large galaxy group catalog (Yang et al. 2007) extracted from the SDSS Data Release 4 (Adelman-McCarthy et al. 2006). In particular, the CLF of central galaxies is modeled as a log-normal:

$$\Phi_{\rm c}(L|M)\, dL = \frac{\log e}{\sqrt{2\pi}\, \sigma_{\rm c}} \exp\left[-\frac{(\log L - \log L_{\rm c})^2}{2\sigma_{\rm c}^2}\right] \frac{dL}{L} \,, \tag{73}$$

and the satellite term as a modified Schechter function

$$\Phi_{\rm s}(L|M)\, dL = \phi^*_{\rm s} \left(\frac{L}{L^*_{\rm s}}\right)^{\alpha_{\rm s}+1} \exp\left[-\left(\frac{L}{L^*_{\rm s}}\right)^2\right] \frac{dL}{L} \,, \tag{74}$$

which decreases faster than a Schechter function at the bright end. Note that $L_{\rm c}$, $\sigma_{\rm c}$, $\phi^*_{\rm s}$, $\alpha_{\rm s}$ and $L^*_{\rm s}$ are all functions of the halo mass $M$.

Following Cacciato et al. (2009), and motivated by the results of Yang et al. (2008) and More et al. (2009a, 2011), we assume that $\sigma_{\rm c}$, which expresses the scatter in $\log L$ of central galaxies at fixed halo mass, is a constant (i.e. is independent of halo mass and redshift). In addition, for $L_{\rm c}$, which is defined such that $\log L_{\rm c}$ is the expectation value for the (10-based) logarithm of the luminosity of a central galaxy, i.e.

$$\log L_{\rm c} = \int_0^{\infty} \Phi_{\rm c}(L|M)\, \log L\, dL \,, \tag{75}$$

we adopt the following parameterization:

$$L_{\rm c}(M) = L_0\, \frac{(M/M_1)^{\gamma_1}}{\left[1 + (M/M_1)\right]^{\gamma_1 - \gamma_2}} \,. \tag{76}$$

Hence, $L_{\rm c} \propto M^{\gamma_1}$ for $M \ll M_1$ and $L_{\rm c} \propto M^{\gamma_2}$ for $M \gg M_1$. Here $M_1$ is a characteristic mass scale, and $L_0 = 2^{\gamma_1 - \gamma_2} L_{\rm c}(M_1)$ is a normalization. For the satellite galaxies we adopt

$$L^*_{\rm s}(M) = 0.562\, L_{\rm c}(M) \,, \tag{77}$$

$$\alpha_{\rm s}(M) = \alpha_{\rm s} \,, \tag{78}$$

and

$$\log[\phi^*_{\rm s}(M)] = b_0 + b_1 (\log M_{12}) + b_2 (\log M_{12})^2 \,, \tag{79}$$

with $M_{12} = M/(10^{12}\,h^{-1}\,M_\odot)$. Note that neither of these functional forms has a physical motivation; they merely were found to adequately describe the results obtained by Yang et al. (2008) from the SDSS galaxy group catalog.

Our parameterization of the CLF thus has a total of nine free parameters

$$\lambda^{\rm CLF} = (\log M_1, \log L_0, \gamma_1, \gamma_2, \sigma_{\rm c}, \alpha_{\rm s}, b_0, b_1, b_2) \,. \tag{80}$$

Figure 1. The halo-halo (top panels) and halo-matter (bottom panels) two-point correlation functions for haloes in three mass bins, as indicated in the top panels [values in square brackets indicate $\log(M/(h^{-1}\,M_\odot))$]. Colored symbols reflect the results obtained from the L250 simulation box. Errorbars (from Poisson statistics) are indicated, but since they are almost always smaller than the symbols, they can only be seen for 2 or 3 data points. The various curves are analytical results for three different values of $\psi$, as indicated in the lower left-hand panel. Note that the model with $\psi = 0.9$ accurately reproduces the sharp feature in $\xi_{\rm hm}(r)$, which reflects the 1-halo to 2-halo transition regime. (Horizontal axis: $\log[r]$, with $r$ in $h^{-1}$ Mpc.)

The final parameter used to describe the halo occupation statistics of the galaxies is $\mathcal{A}_P$, defined in Eq. (39). In our fiducial model, adopted here, we will keep this parameter fixed at $\mathcal{A}_P = 1$, which corresponds to assuming that satellites follow Poisson statistics. As shown in Yang et al. (2008), this assumption has strong support from galaxy group catalogs. Additional support comes from numerical simulations which show that dark matter subhaloes (which are believed to host satellite galaxies) also follow Poisson statistics (Kravtsov et al. 2004). However, there are also some indications that the occupation statistics of subhaloes and/or satellite galaxies are actually slightly super-Poisson, i.e., $\mathcal{A}_P > 1$ (e.g., Porciani, Magliocchetti & Norberg 2004; Giocoli et al. 2010a; Busha et al. 2011; Boylan-Kolchin et al. 2010). Hence, in Paper III we will also discuss models in which $\mathcal{A}_P$ is taken to be a free parameter.
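To make the link between the CLF and the occupation statistics explicit, the sketch below integrates the central and satellite CLFs of Eqs. (73)-(79) over a luminosity range. The parameter values in `PARAMS` are placeholders chosen for illustration only (they are not the values used in this paper), and the function names and the cutoff of the satellite integral at $6 L^*_{\rm s}$ are assumptions of the example.

```python
import math

# Placeholder parameter values for illustration only (not a fit to any data).
PARAMS = {"log_m1": 11.0, "log_l0": 10.0, "gamma1": 3.0, "gamma2": 0.25,
          "sigma_c": 0.15, "alpha_s": -1.2, "b0": -1.0, "b1": 1.0, "b2": -0.1}

def central_luminosity(mass, p=PARAMS):
    """Eq. (76): L_c(M) = L_0 (M/M_1)^g1 / [1 + M/M_1]^(g1 - g2)."""
    x = mass / 10.0 ** p["log_m1"]
    return (10.0 ** p["log_l0"] * x ** p["gamma1"]
            / (1.0 + x) ** (p["gamma1"] - p["gamma2"]))

def mean_centrals(mass, l_min, l_max=math.inf, p=PARAMS):
    """<N_c|M> in [l_min, l_max]: Eq. (73) is a Gaussian in log10 L."""
    log_lc = math.log10(central_luminosity(mass, p))
    norm = math.sqrt(2.0) * p["sigma_c"]
    upper = math.erf((math.log10(l_max) - log_lc) / norm)
    lower = math.erf((math.log10(l_min) - log_lc) / norm)
    return 0.5 * (upper - lower)

def mean_satellites(mass, l_min, l_max=math.inf, p=PARAMS, n_steps=2000):
    """<N_s|M> in [l_min, l_max]: Simpson integral of Eq. (74) in ln(L / L*)."""
    log_m12 = math.log10(mass / 1e12)
    phi_star = 10.0 ** (p["b0"] + p["b1"] * log_m12 + p["b2"] * log_m12 ** 2)
    l_star = 0.562 * central_luminosity(mass, p)       # Eq. (77)
    u_lo = l_min / l_star
    u_hi = min(l_max / l_star, 6.0)                    # exp(-36) is negligible
    if u_hi <= u_lo:
        return 0.0
    ln_lo, ln_hi = math.log(u_lo), math.log(u_hi)
    h = (ln_hi - ln_lo) / n_steps

    def integrand(ln_u):
        u = math.exp(ln_u)
        return u ** (p["alpha_s"] + 1.0) * math.exp(-u * u)

    total = integrand(ln_lo) + integrand(ln_hi)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(ln_lo + i * h)
    return phi_star * total * h / 3.0
```

Because the central CLF is a normalized log-normal, `mean_centrals` integrated over all luminosities equals one, i.e. every halo hosts exactly one central galaxy in this parameterization.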



4 MODEL TESTS 

In this section we describe the construction of large mock 
galaxy distributions, which we use to calibrate and test 
the real-space galaxy-galaxy and galaxy-matter correlation 
functions computed using the method outlined in §2.2. In 
particular, we calibrate the scale dependence of the halo 
bias and test the accuracy of our halo-exclusion treatment, 
which we compare to some approximate methods that do 
not account for halo exclusion but that are frequently used 
in the literature. In addition, we also use these mock galaxy 
distributions to test our correction for residual redshift space 
distortions. 

4.1 Construction of Mock Galaxy Distributions 

For testing and calibrating the method described in §2 we use two different $N$-body simulations that have been run using the adaptive refinement technique (ART) of Kravtsov, Klypin & Khokhlov (1997). Both simulations have been used by Tinker et al. (2008, 2010) in their studies of the halo mass function and halo bias function, where they are called L250 and L1000W. We adopt the same nomenclature in what follows.

Figure 2. Top panels show the galaxy-galaxy two-point correlation functions for three different magnitude bins, as indicated in the top panels [values in square brackets indicate ${}^{0.1}M_r - 5\log h$]. Colored symbols reflect the results obtained from the mock galaxy distribution in the L250 simulation box, while the solid line is the prediction of our analytical model. The middle panels show the contributions from the 1-halo central-satellite term (purple symbols, labeled '1h[cs]'), the 1-halo satellite-satellite term (blue symbols, labeled '1h[ss]'), and the 2-halo term (green symbols, labeled '2h'). Once again, the solid lines show the model predictions. As in Fig. 1, errorbars reflecting Poisson statistics are indicated, but are almost always smaller than the symbols. The bottom panels show the fractional difference between model and mock for the total correlation functions shown in the top panels. The dark and light shaded areas indicate fractional errors of less than 5 and 10 percent, respectively. As is evident, the accuracy of our model is typically better than 5 percent, and always better than 10 percent.

Simulation L250 follows the evolution of $512^3$ dark matter particles in a cubic box of $250\,h^{-1}$ Mpc size in a flat $\Lambda$CDM cosmology with matter density $\Omega_{\rm m} = 0.3$, baryon density $\Omega_{\rm b} = 0.04$, Hubble parameter $h = 0.7$, spectral index $n_{\rm s} = 1.0$, and a matter power spectrum normalization of $\sigma_8 = 0.9$. Simulation L1000W follows the evolution of $1024^3$ dark matter particles in a $1\,h^{-1}$ Gpc size box in a flat $\Lambda$CDM cosmology with matter density $\Omega_{\rm m} = 0.27$, baryon density $\Omega_{\rm b} = 0.044$, Hubble parameter $h = 0.7$, spectral index $n_{\rm s} = 0.95$, and a matter power spectrum normalization of $\sigma_8 = 0.79$. The particle masses are $m_{\rm p} = 9.69 \times 10^{9}\,h^{-1}\,M_\odot$ and $m_{\rm p} = 6.98 \times 10^{10}\,h^{-1}\,M_\odot$ for L250 and L1000W, respectively.
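The quoted particle masses follow directly from the box size, the particle number, and the matter density. The short check below assumes the standard value of the critical density, $\rho_{\rm crit} \simeq 2.775 \times 10^{11}\,h^2\,M_\odot\,{\rm Mpc}^{-3}$; the function name is a choice made for this example.

```python
RHO_CRIT = 2.775e11  # critical density in h^2 M_sun Mpc^-3 (assumed value)

def particle_mass(omega_m, box_size, n_per_side):
    """Particle mass in h^-1 M_sun for a cubic box of side box_size (h^-1 Mpc)."""
    return omega_m * RHO_CRIT * box_size ** 3 / n_per_side ** 3

m_l250 = particle_mass(0.30, 250.0, 512)      # L250
m_l1000w = particle_mass(0.27, 1000.0, 1024)  # L1000W
```

With these masses, a halo of $10^{12}\,h^{-1}\,M_\odot$ contains roughly a hundred particles in L250 but only about fourteen in L1000W, which gives a feel for the mass resolution of the two boxes.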



For both simulations we use the halo catalogs at $z = 0$, kindly provided to us by Jeremy Tinker. These haloes are defined as spheres with an overdensity of 200, which is identical to our definition of halo mass (see §2.1). More information about these simulations and the identification of their dark matter haloes can be found in Tinker et al. (2008).

Figure 3. Same as Fig. 2 but now for the galaxy-matter cross correlations. In the middle row of panels, the 1-halo component is split in the central-matter (purple symbols, labeled '1h[cm]') and satellite-matter (blue symbols, labeled '1h[sm]') parts. Similar to the galaxy-galaxy correlation functions, the accuracy of our model is typically better than 5 percent, and always better than 10 percent.

In what follows we will use the L250 simulation box to calibrate and test our galaxy-galaxy and galaxy-matter correlation functions, while L1000W is used to test our correction for residual redshift space distortions. To this end, we construct mock galaxy distributions by populating the dark matter haloes with model galaxies using the CLF. In particular, we model the CLF using the parameterization described in §3.7 with the following parameters:
$L_0 = 10^{9.95}\,h^{-2}\,L_\odot$, $M_1 = 10^{10.9}\,h^{-1}\,M_\odot$, $\sigma_{\rm c} = 0.16$, $\gamma_1 = 5.0$,
$\gamma_2 = 0.24$, $\alpha_{\rm s} = -1.3$, $b_0 = -1.2$, $b_1 = 1.4$, and $b_2 = -0.17$. For each halo we first draw the luminosity of its central galaxy from $\Phi_{\rm cen}(L|M)$, given by Eq. (73). Next, we draw the number of satellite galaxies, under the assumption that $P(N_{\rm sat}|M)$ follows a Poisson distribution (i.e., $\mathcal{A}_P = 1.0$) with mean

$$\langle N_{\rm sat}|M\rangle = \int_{L_{\rm min}}^{\infty} \Phi_{\rm sat}(L|M)\, dL \,, \tag{81}$$

where we adopt a luminosity threshold, $L_{\rm min}$, corresponding to ${}^{0.1}M_r - 5\log h = -18$ (here ${}^{0.1}M_r$ indicates the SDSS $r$-band magnitude, $K$-corrected to $z = 0.1$; see Blanton et al. 2003). For each of the $N_{\rm sat}$ satellites in the halo in question we then draw a luminosity from the satellite CLF $\Phi_{\rm sat}(L|M)$, given by Eq. (74).
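The three sampling steps described above can be sketched with the Python standard library. The function names, the use of Knuth's method for the Poisson draw, and the rejection sampler for the satellite luminosities are choices made for this illustration and are not necessarily how the mocks in this paper were generated; the satellite sampler additionally assumes $\alpha_{\rm s} < -1$.

```python
import math
import random

def draw_central_log_luminosity(log_lc, sigma_c, rng):
    """Eq. (73): log10 L of the central is Gaussian with mean log10 L_c."""
    return rng.gauss(log_lc, sigma_c)

def draw_poisson(mean, rng):
    """Knuth's multiplication method; adequate for modest means."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def draw_satellite_luminosity(l_min, l_star, alpha_s, rng):
    """Rejection-sample L >= l_min from Eq. (74); requires alpha_s < -1.

    In x = ln(L / L*) the density is exp((alpha_s + 1) x - exp(2 x)). The
    proposal is the exponential tail exp((alpha_s + 1) (x - x_min)), and a
    proposal is accepted with probability exp(exp(2 x_min) - exp(2 x)).
    """
    rate = -(alpha_s + 1.0)
    x_min = math.log(l_min / l_star)
    floor = math.exp(2.0 * x_min)
    while True:
        x = x_min + rng.expovariate(rate)
        if rng.random() < math.exp(floor - math.exp(2.0 * x)):
            return l_star * math.exp(x)

def populate_halo(log_lc, sigma_c, mean_nsat, l_min, l_star, alpha_s, rng):
    """Return (log10 L of the central, list of satellite luminosities)."""
    central = draw_central_log_luminosity(log_lc, sigma_c, rng)
    n_sat = draw_poisson(mean_nsat, rng)
    sats = [draw_satellite_luminosity(l_min, l_star, alpha_s, rng)
            for _ in range(n_sat)]
    return central, sats
```

Here `mean_nsat` is the mean of Eq. (81), which has to be supplied by the caller, and passing a seeded `random.Random` instance as `rng` makes the mock reproducible.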

Having assigned all mock galaxies their luminosities, the next step is to assign them a position and velocity within their halo. We assume that the central galaxy resides at rest at the center of the halo, while satellite galaxies follow a spherically symmetric number-density distribution proportional to Eq. (71) with $\mathcal{R} = \gamma = 1$, i.e. we assume that satellite galaxies are unbiased with respect to the dark matter. For the halo concentrations we adopt the concentration-mass relation of Maccio et al. (2007), properly converted



[Figure 4 panels: magnitude bins [-18,-19.5], [-19.5,-21], and [-21,-22.5]; rows for $(\Omega_{\rm m}, \sigma_8) = (0.30, 0.90)$ and $(0.24, 0.74)$; horizontal axis $\log[r/(h^{-1}\,{\rm Mpc})]$; curves labeled 'linear' and 'no-exclusion'.]

Figure 4. The fractional errors of the approximate 'no-exclusion' model (solid lines) and 'linear' model (dashed lines). Results are shown for three magnitude bins, as indicated, and for two different cosmologies+CLF. In the upper panels we use the same cosmology and CLF as for the mocks in Figs. 1 - 3; in the lower panels we use the WMAP3 cosmology and the corresponding best-fit CLF model of Cacciato et al. (2009). The dark and light shaded areas indicate fractional errors of less than 5 and 10 percent, respectively. Note that both the 'no-exclusion' model and the 'linear' model have fractional errors that can easily exceed 30-40 percent, which is inadequate for precision cosmology.



to our definition of halo mass. Finally, the peculiar velocities of the satellite galaxies are assigned as follows. We assume that satellite galaxies are in a steady-state equilibrium within their dark matter potential well with an isotropic distribution of velocities with respect to the halo center. One-dimensional velocities are drawn from a Gaussian

$$f(v_j) = \frac{1}{\sqrt{2\pi}\, \sigma_{\rm sat}(r)} \exp\left[-\frac{v_j^2}{2\sigma_{\rm sat}^2(r)}\right] \,, \tag{82}$$

with $v_j$ the velocity relative to that of the central galaxy along axis $j$ and $\sigma_{\rm sat}(r)$ the local, one-dimensional velocity dispersion obtained from solving the Jeans equation (see van den Bosch et al. 2004; More et al. 2009b).

For reasons that will become clear below, in both simulation boxes we only populate dark matter haloes with masses in the range $M_{\rm min} \leq M \leq M_{\rm max}$, where $M_{\rm min} = 10^{12}\,h^{-1}\,M_\odot$ and $10^{13}\,h^{-1}\,M_\odot$ for L250 and L1000W, respectively, while $M_{\rm max} = 10^{14.5}\,h^{-1}\,M_\odot$ for both L250 and L1000W.



4.2 Calibrating Scale Dependence of Halo Bias 

As discussed in §3.4, fitting function (65) for the radial bias 
is likely to be inaccurate on small scales due to the fact that 
it was calibrated for a different halo definition than the one 
used here. To investigate the magnitude of this effect, and 



to test plausible corrections for it, we compare our model 
predictions against the L250 simulation box. 

We start by computing both the halo-halo auto-correlation function, $\xi_{\rm hh}(r|M)$, and the halo-matter cross-correlation function, $\xi_{\rm hm}(r|M)$, for a number of bins in halo mass. We only consider haloes in the mass range $10^{12}\,h^{-1}\,M_\odot \leq M \leq 10^{14.5}\,h^{-1}\,M_\odot$. The lower limit is needed to account for the fact that the simulation has a finite mass resolution, while the upper limit is adopted to be less sensitive to cosmic variance originating from the relatively small volume of the simulation box. Over the mass range $10^{12}\,h^{-1}\,M_\odot \leq M \leq 10^{14.5}\,h^{-1}\,M_\odot$ the halo mass function is in excellent agreement with the fitting function of Tinker et al. (2008), which is also the one used in our analytical calculations. Note that when cross-correlating the haloes with the dark matter particles, we only consider the particles associated with haloes in the mass range $10^{12}\,h^{-1}\,M_\odot \leq M \leq 10^{14.5}\,h^{-1}\,M_\odot$: a large fraction of all dark matter particles in the simulation box are not associated with any dark matter halo, but that is simply a manifestation of the limited (mass and force) resolution of the $N$-body simulation. In other words, the L250 simulation does not properly resolve (non-linear) structure on a mass scale $M < 10^{12}\,h^{-1}\,M_\odot$, and we therefore do not expect our model to accurately reproduce the halo-matter cross correlation function of the simulation if the cross correlation is with all dark matter.

The resulting $\xi_{\rm hh}(r|M)$ and $\xi_{\rm hm}(r|M)$ are shown as filled circles in the upper and lower panels of Fig. 1, respectively. The blue, dashed lines are our model results, which are obtained using the same model as for the galaxy-galaxy and galaxy-matter correlation functions described in §2.2, but by setting $\langle N_{\rm c}|M\rangle = 1$ if the halo mass $M$ falls within the halo mass bin in consideration, and $\langle N_{\rm c}|M\rangle = 0$ otherwise, plus $\langle N_{\rm s}|M\rangle = 0$ for all $M$. Note that all integrals over halo mass are only integrated over the range $10^{12}\,h^{-1}\,M_\odot \leq M \leq 10^{14.5}\,h^{-1}\,M_\odot$. Also, when Fourier transforming the power spectrum to obtain the correlation function, we adopt a lower limit for the wavenumbers in order to account for the fact that the simulation box has a finite size and periodic boundary conditions: specifically, in Eq. (42) we replace the lower limit of the integration range by $k_{\rm min} = \sqrt{3} \times (2\pi/L_{\rm box})$. In this model we have set $\psi = +\infty$, which implies that we have simply adopted the radial bias function of Tinker et al. (2005) without any modification (i.e., $\zeta(r,z) = \zeta_0(r,z)$; see §3.4).

The model accurately fits the halo-matter cross correlation functions on both small and large scales. The former indicates that our modeling of the halo density profiles, $u_{\rm h}(r|M)$, is accurate (i.e., we are not making a significant error because we do not account for halo triaxiality, halo substructure and scatter in halo concentration; see §3.5), while the good fit on large scales argues that our treatment of halo bias is adequate. However, the model clearly underpredicts $\xi_{\rm hm}(r)$ at the 1-halo to 2-halo transition regime, which is especially conspicuous in the lower mass bin (lower left-hand panel of Fig. 1). The upper panels clearly indicate that this is a reflection of the fact that the model underpredicts the halo-halo correlation function on small scales ($\sim 1\,h^{-1}$ Mpc; just before halo exclusion sets in). The solid and dotted lines are models in which we have used our modified version of the radial bias function (Eq. [66]) with $\psi = 0.9$ and 0.6, respectively. The former provides the best fit overall; it somewhat overpredicts the halo-halo correlation function on small scales in the lowest mass bin, but results in excellent fits to the other correlation functions. The model with $\psi = 0.6$, on the other hand, clearly overpredicts the small scale clustering of the dark matter haloes for all mass bins. Detailed tests, including additional halo mass bins and other functional forms for a modified $\zeta(r,z)$, indicate that Eq. (66) with $\psi = 0.9$ yields the best results, while still allowing for a sufficiently fast numerical evaluation. We have also experimented with the modification suggested by Tinker et al. (2012; see their Appendix A), which is identical to Eq. (66), except that they adopt $r_\psi = r_{200}(M_1) + r_{200}(M_2)$ rather than Eq. (67). Not only do we find this method to be less accurate, especially for the lower mass bins, but the dependence of $r_\psi$ on halo mass also makes the evaluation of $Q(k|M_1,M_2,z)$ more CPU intensive.

Note though, that there is no guarantee that $\psi = 0.9$ is also the best-fit parameter for any cosmology other than the one considered here. Hence, if we simply adopt $\psi = 0.9$ when trying to constrain cosmological parameters, we might introduce an unwanted systematic bias. Fortunately, as we demonstrate in Paper II, $\psi$ is only weakly degenerate with the cosmological parameters; most of its degeneracy is with the parameters that describe the satellite CLF. Hence, errors in $\psi$ may result in systematic errors in the inferred satellite fractions, but will not significantly bias our constraints on cosmological parameters. Nevertheless, in order to be conservative, we will marginalize over uncertainties in $\psi$ when fitting for cosmological parameters (see Paper III).



4.3 Testing Halo Exclusion 

Having calibrated the scale dependence of the halo bias,
we now proceed to test the accuracy of our model in
calculating $\xi_{\rm gg}$ and $\xi_{\rm gm}$, focusing in particular on the
accuracy of our treatment of halo exclusion. Using the mock
galaxy distribution (hereafter MGD) of the L250 simulation
box, we first compute the real-space correlation function
for three different luminosity bins. The orange filled
circles in the upper panels of Fig. 2 show the results thus
obtained. In the panels in the middle row, we show the
contribution to $\xi_{\rm gg}(r)$ from the 2-halo term (green filled
circles), the 1-halo central-satellite term (purple filled
circles) and the 1-halo satellite-satellite term (blue filled
circles). In the high-luminosity bin (right-hand panels), the
galaxy-galaxy correlation function is dominated by the
1-halo central-satellite term on small scales ($r < 0.3\,h^{-1}$ Mpc),
and by the 2-halo term on large scales ($r \gtrsim 1.0\,h^{-1}$ Mpc). On
intermediate scales, the 1-halo satellite-satellite term
dominates. Note how this term becomes more and more
dominant for less luminous galaxies; in fact, in the lowest
luminosity bin considered here (left-hand panels), the 1-halo
satellite-satellite term completely dominates the signal for
$r < 1\,h^{-1}$ Mpc. This reflects the fact that the satellite
fraction increases drastically from $f_{\rm sat} \simeq 0.136$ for the brightest
bin, to $f_{\rm sat} \simeq 0.465$ for the intermediate luminosity bin, to
$f_{\rm sat} \simeq 0.996$ for the faintest bin. Note, though, that these
satellite fractions are unrealistic due to the adopted
cut-offs in halo mass at $M = 10^{12}\,h^{-1}\,M_\odot$ and $10^{14.5}\,h^{-1}\,M_\odot$.
For example, for the CLF adopted here, virtually all central
galaxies with $r$-band magnitudes ($K$-corrected to $z = 0.1$)
in the range $-18 > {}^{0.1}M_r - 5\log h > -19.5$ reside in haloes
with $M < 10^{12}\,h^{-1}\,M_\odot$, which are not accounted for in our
MGD; hence, almost all mock galaxies in this magnitude
range are satellites. For comparison, if we were to integrate
our CLF over the entire mass range from $M = 0$ to $M = \infty$,
the corresponding satellite fractions, given by



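$$f_{\rm sat}(L_1, L_2) = \frac{\int_{L_1}^{L_2} {\rm d}L \int_0^{\infty} \Phi_{\rm s}(L|M)\, n(M)\, {\rm d}M}{\int_{L_1}^{L_2} {\rm d}L \int_0^{\infty} \Phi(L|M)\, n(M)\, {\rm d}M} \qquad (83)$$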



are equal to $f_{\rm sat} = 0.334$, $0.253$, and $0.167$ from the faintest
to the brightest bin, respectively. Although the trends seen
in Fig. 2 are stronger than what is expected in reality, we
consider the unrealistically large dynamic range in $f_{\rm sat}$
covered here to be beneficial for the purpose of testing the
accuracy of our model.
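As a minimal sketch of how a satellite fraction of the form of Eq. (83) can be evaluated numerically, the snippet below integrates a satellite and a central CLF over luminosity and halo mass with a midpoint rule in logarithmic space. It assumes that the total CLF is the sum of a central and a satellite part; the function names, the finite mass limits and the toy inputs are hypothetical placeholders and are not the CLF or halo mass function used in this paper.

```python
import math


def midpoint_log_integral(func, lo, hi, n=200):
    """Integrate func(x) dx from lo to hi using the midpoint rule in ln(x)."""
    dlnx = (math.log(hi) - math.log(lo)) / n
    total = 0.0
    for i in range(n):
        x = math.exp(math.log(lo) + (i + 0.5) * dlnx)
        total += func(x) * x * dlnx  # dx = x dln(x)
    return total


def satellite_fraction(phi_s, phi_c, n_of_m, l1, l2, m_lo, m_hi, n=200):
    """Satellites over all galaxies with l1 < L < l2 (form of Eq. 83).

    phi_s, phi_c : callables (L, M) -> satellite / central CLF (assumed inputs)
    n_of_m       : callable M -> halo mass function (assumed input)
    m_lo, m_hi   : finite stand-ins for the limits 0 and infinity
    """
    def number_density(phi):
        return midpoint_log_integral(
            lambda lum: midpoint_log_integral(
                lambda mass: phi(lum, mass) * n_of_m(mass), m_lo, m_hi, n),
            l1, l2, n)

    n_sat = number_density(phi_s)
    n_cen = number_density(phi_c)
    return n_sat / (n_sat + n_cen)


# Toy inputs (purely illustrative): centrals outnumber satellites 3:1 everywhere.
def toy_shape(lum, mass):
    return lum ** -1.0 * mass ** -0.5


f_sat_toy = satellite_fraction(
    phi_s=lambda lum, mass: 1.0 * toy_shape(lum, mass),
    phi_c=lambda lum, mass: 3.0 * toy_shape(lum, mass),
    n_of_m=lambda mass: mass ** -2.0,
    l1=1e9, l2=1e10, m_lo=1e12, m_hi=1e15, n=100)
print(f_sat_toy)
```

Restricting `m_lo` and `m_hi` to the mass range populated in a mock, rather than to the full range, is what drives the difference between the mock satellite fractions and the full-range values quoted in the text.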

Figure 5. The correction factor, $f_{\rm corr}(r_p)$, that describes the effect of residual redshift space distortions that arise from the use of a
finite integration range when computing the projected correlation function, i.e., from Eq. (45) with a finite $r_{\rm max}$. The shaded circles
show the results obtained from the mock galaxy distribution in the L1000W simulation box with $r_{\rm max} = 40\,h^{-1}$ Mpc. Results are shown
for the same three magnitude bins as in Figs. 2 - 4, as indicated. Dashed and solid curves correspond to the $f_{\rm corr}(r_p)$ obtained using the
Kaiser formalism (see §2.3) with the linear and non-linear galaxy-galaxy correlation functions, respectively. The latter is in much better
agreement with the mock results on small scales. See text for a detailed discussion.

The solid lines in the panels in the upper and middle
rows of Fig. 2 are the analytical results obtained using
our fiducial model with halo exclusion and with $\psi = 0.9$.
Here we have adopted the same cosmology, redshift and
CLF parameters as for the MGD. Note that, once again,
all integrals over halo mass are evaluated only over the
range $10^{12}\,h^{-1}\,M_\odot < M < 10^{14.5}\,h^{-1}\,M_\odot$, and we adopt
$k_{\rm min} = \sqrt{3} \times (2\pi/L_{\rm box})$ for the integration range in Eq. (42).
Overall the agreement between our analytical prediction and
the results from the MGD is extremely good. As is evident
from the panels in the middle row, our treatment of halo
exclusion nicely captures the sudden decline of the 2-halo term
on small scales. Although the analytical 2-halo term becomes
less accurate for $r < 0.5\,h^{-1}$ Mpc, mainly due to numerical
issues, at these small scales the 1-halo term always dominates
the total correlation function by at least an order of
magnitude. Hence, this inaccuracy is of little practical concern.
This is evident from the lower panels, where we plot the
difference between the model prediction and the true correlation
function in the mock, normalized by the latter, as function of
radius. Over the entire range $0.01\,h^{-1}$ Mpc $< r < 10\,h^{-1}$ Mpc
the model predictions agree with the mock results to an
accuracy of a few percent (typically $< 5\%$). At the 1-halo to
2-halo transition scale ($r \sim 1\,h^{-1}$ Mpc), which has been
notoriously difficult to model accurately, the errors are somewhat
larger but always stay below 10%.
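The two quantities used in this comparison are simple to write down explicitly. The following is a minimal sketch, assuming that the L250 box has a side length of 250 $h^{-1}$ Mpc (an assumption based on its name); the function names and the toy arrays are hypothetical and are not measurements from the mock.

```python
import math


def k_min(l_box):
    """Smallest wavenumber adopted for a periodic box: sqrt(3) * 2 pi / L_box."""
    return math.sqrt(3.0) * 2.0 * math.pi / l_box


def fractional_error(model, mock):
    """(model - mock) / mock per radial bin, as plotted in the lower panels."""
    return [(m - d) / d for m, d in zip(model, mock)]


# Assumed box size in h^-1 Mpc; the result is in h Mpc^-1.
print(k_min(250.0))

# Toy correlation-function values (illustrative only).
toy_mock = [100.0, 10.0, 1.0]
toy_model = [103.0, 9.8, 1.05]
print(fractional_error(toy_model, toy_mock))
```

For a 250 $h^{-1}$ Mpc box this gives $k_{\rm min} \approx 0.0435\,h\,{\rm Mpc}^{-1}$.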

Fig. 3 shows the same as Fig. 2, but now for the galaxy-matter
cross correlation, $\xi_{\rm gm}(r)$. Similar trends are evident;
the model's 2-halo term becomes less accurate on small 
scales, but this has little to no impact on the quality of 
the model as is evident from the lower panels. As for the 
galaxy-galaxy correlation function, the model agrees with 
the simulation results at the few percent level. In particular, 
it is noteworthy that the model is accurate at better than 10 
percent on small scales. This indicates that non-sphericity of 
haloes, scatter in halo concentration, and halo substructure, 
all of which are ignored in our model, do not have a large 
($\gtrsim 10$ percent) impact on the results (see §5 for a detailed
discussion). 



4.4 Testing the Approximate Linear Model 

As we have demonstrated above, our implementations of halo
exclusion and scale dependence of the bias are accurate at
the few percent level. However, the required computation of
$Q(k|M_1, M_2, z)$, defined in Eq. (20), is fairly CPU intensive.
The computation of $w_p(r_p)$ and $\Delta\Sigma(R)$ for six luminosity
bins (i.e., a single model; see Paper III) takes $\sim 20$ seconds
on a single (fast) processor. Consequently, the construction
of an adequate Monte Carlo Markov Chain (which has to
be large given that our model has anywhere from 14 to 19
free parameters, depending on the priors used) takes several
days to complete (on a single processor). Although this is
not a major challenge in light of the fact that most desktop
computers nowadays have multiple processors, it would
nevertheless be hugely advantageous if a much faster,
approximate method could be found. In particular, the code can be
made much faster if we were to ignore halo exclusion and/or
the scale dependence of the halo bias.
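To make the cost argument concrete, the sketch below converts a per-model evaluation time into a wall-clock time for a chain. Only the 20 seconds per model comes from the text; the chain length, the number of processors and the speed-up factor are hypothetical inputs chosen for illustration, and perfect parallel scaling is assumed.

```python
SECONDS_PER_DAY = 86400.0


def models_per_day(seconds_per_model=20.0):
    """Number of model evaluations one processor completes per day."""
    return SECONDS_PER_DAY / seconds_per_model


def chain_wall_time_days(n_models, seconds_per_model=20.0,
                         n_processors=1, speedup=1.0):
    """Wall-clock days needed for n_models evaluations.

    speedup is a hypothetical factor by which an approximate model is
    faster than the fiducial one; perfect parallel scaling is assumed.
    """
    total_seconds = n_models * seconds_per_model / (speedup * n_processors)
    return total_seconds / SECONDS_PER_DAY


print(models_per_day())                            # evaluations per day
print(chain_wall_time_days(43200))                 # hypothetical chain length
print(chain_wall_time_days(43200, speedup=10.0))   # same chain, 10x faster model
```

At 20 seconds per model a single processor evaluates 4320 models per day, so a chain of a few times $10^4$ models indeed takes several days.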

In this section we therefore investigate the pros (increase
in speed) and cons (decrease in accuracy) of two
different simplifications of our model. The first simplification
is to ignore halo exclusion, i.e., we set $r_{\rm min} = 0$ in
Eq. (11). In that case we have that
$\xi_{\rm hh}(r, z|M_1, M_2) = b_{\rm h}(M_1, z)\, b_{\rm h}(M_2, z)\, \zeta(r, z)\, \xi_{\rm mm}(r, z)$,
and the two-halo term of the power spectrum (33) simplifies to

$$P^{\rm 2h}_{\rm xy}(k, z) = b_{\rm x}(k, z)\, b_{\rm y}(k, z)\, P_{\rm ne}(k, z), \qquad (84)$$

where 'x' and 'y' are either 'c' (for central), 's' (for satellite),
or 'm' (for matter),

$$b_{\rm x}(k, z) = \int_0^{\infty} {\rm d}M\, \mathcal{H}_{\rm x}(k, M, z)\, n(M, z)\, b_{\rm h}(M, z), \qquad (85)$$

with $\mathcal{H}_{\rm x}(k, M, z)$ given by Eqs. (34)-(36), and

$$P_{\rm ne}(k, z) = 4\pi \int_0^{\infty} \zeta(r, z)\, \xi_{\rm mm}(r, z)\, \frac{\sin kr}{kr}\, r^2\, {\rm d}r. \qquad (86)$$
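The integral in Eq. (86) is a spherical Fourier transform of the product of the radial bias function and the matter correlation function. A minimal sketch of its numerical evaluation is given below, using a plain midpoint rule and a finite upper limit. The function names, the cut-off `r_max` and the toy Gaussian input are hypothetical; in practice the inputs would be the calibrated radial bias function and the non-linear matter correlation function, and a more careful quadrature may be needed for oscillatory integrands.

```python
import math


def sinc(x):
    """sin(x)/x with the x -> 0 limit handled explicitly."""
    return 1.0 if abs(x) < 1e-8 else math.sin(x) / x


def power_no_exclusion(k, zeta, xi_mm, r_max=50.0, n=20000):
    """4 pi * int_0^r_max zeta(r) xi_mm(r) sin(kr)/(kr) r^2 dr (midpoint rule).

    zeta, xi_mm : callables of r (assumed inputs at fixed redshift)
    r_max       : finite stand-in for the upper limit of infinity
    """
    dr = r_max / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += zeta(r) * xi_mm(r) * sinc(k * r) * r * r
    return 4.0 * math.pi * total * dr


# Sanity check with a toy Gaussian correlation function and zeta = 1, for which
# the transform is known analytically: (2 pi s^2)^(3/2) exp(-k^2 s^2 / 2).
s = 2.0
k = 0.5
numeric = power_no_exclusion(k, lambda r: 1.0,
                             lambda r: math.exp(-0.5 * (r / s) ** 2))
analytic = (2.0 * math.pi * s * s) ** 1.5 * math.exp(-0.5 * (k * s) ** 2)
print(numeric, analytic)
```

Because this integral depends only on $k$ and $z$, and not on the halo masses, it can be tabulated once per cosmology, which is where the speed-up with respect to the full exclusion treatment comes from.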

This simplified model has the great advantage that it does
not require the tedious and CPU intensive evaluation of
$Q(k|M_1, M_2, z)$, causing a speed-up of a factor $\sim 10$, while
still accounting for the scale dependence of the halo bias. In
what follows we shall refer to this model as the 'no-exclusion
model'. The solid lines in Fig. 4 show the relative error in
$\xi_{\rm gg}(r)$ of the no-exclusion model with respect to our
fiducial model with halo exclusion. Results are shown for three
magnitude bins, as indicated in the top panels, and for two
different cosmologies/CLFs. In the upper panels we use the
same cosmology and CLF as for the mocks described in §4.1.
In the lower panels we use the WMAP3 cosmology, i.e., the
cosmological parameters that best fit the three year data
release of the Wilkinson Microwave Anisotropy Probe (Spergel
et al. 2007) and the best-fit CLF model for that cosmology
obtained by Cacciato et al. (2009). The main motivation
for showing results for two different cases is to emphasize
that the fractional errors of the no-exclusion model may
vary quite significantly from one cosmology and/or CLF to
another. Clearly the no-exclusion model in general
overpredicts the galaxy-galaxy correlation functions on small scales
($r < 2\,h^{-1}$ Mpc) by 20 to 50 percent.¶

At the risk of further deteriorating the accuracy of the
model, we can make additional simplifications by replacing
$P_{\rm ne}(k, z)$ in Eq. (84) by the linear matter power spectrum,
$P^{\rm lin}_{\rm mm}(k, z)$. This results in the 'linear' halo model, which has
been used previously by numerous authors (e.g., Ma & Fry
2000; Seljak 2000; Scoccimarro et al. 2001; Guzik & Seljak
2002; Mandelbaum et al. 2005; Seljak et al. 2005; see also
Cooray & Sheth 2002 and Mo et al. 2010). This removes the
need for the integration (86) and therefore further speeds
up the computation, albeit at the cost of ignoring the scale
dependence of the halo bias. The dashed curves in Fig. 4
show how these 'linear' galaxy-galaxy correlation functions
compare to the fiducial model with halo exclusion and with
scale dependence of halo bias. Somewhat surprisingly, for
the cosmology+CLF shown in the upper panels, this linear
model performs significantly better than the no-exclusion
model, with errors that are always below 10 percent. This
indicates that halo exclusion and scale dependence of halo
bias have comparable but opposite effects on small scales
($r < 1\,h^{-1}$ Mpc), which may roughly cancel each other. The
lower panels, however, show that this is not always the case,
and that the linear model can significantly underestimate
the galaxy-galaxy correlation functions (by as much as 30-40
percent) in the 1-halo to 2-halo transition regime. In
addition, the linear model typically overpredicts the correlation
power on large scales of $\sim 10\,h^{-1}$ Mpc by 10 percent. This
is a well known effect that has already been discussed in
numerous studies of the halo model (e.g., Ma & Fry 2000;
Seljak 2000; Scoccimarro et al. 2001; Smith et al. 2003; Cole
et al. 2005; Hayashi & White 2008). Finally we note that
similar tests for the galaxy-matter cross correlation
functions yield fractional errors for the no-exclusion and linear
models that are very similar to those for the galaxy-galaxy
correlation functions shown in Fig. 4.

Hence, despite the order of magnitude increase in
computational speed, we conclude that both the 'no-exclusion'
model and the 'linear' model suffer from systematic
inaccuracies that can easily reach 30 to 40 percent, which we
consider inadequate for the purpose of constraining
cosmological parameters. In Papers II and III we therefore
exclusively use the much more accurate, but more CPU intensive,
model described in §2 above, which properly accounts for
both halo exclusion and scale dependence of the halo bias.

¶ The sharp features apparent around $0.3\,h^{-1}$ Mpc are not due
to numerical noise, but are real manifestations of halo exclusion.

4.5 Redshift Space Distortions 

As discussed in §2.3, the projected correlation functions used
to constrain the models have been obtained using a finite
range of integration along the line-of-sight. Consequently,
they suffer from residual redshift space distortions (RRSDs)
that need to be corrected for. In this section we investigate
the magnitude of these RRSDs, as well as the accuracy of
our correction method, which is based on the linear Kaiser
formalism (Kaiser 1987). To that end we use the mock
galaxy distribution (MGD) obtained from the L1000W simulation
box, as described in §4.1. We first use this MGD to
compute the projected correlation function, $w_p(r_p)$, for three
luminosity bins, by integrating the corresponding $\xi_{\rm gg}(r_p, r_\pi)$
out to $r_{\rm max} = 40\,h^{-1}$ Mpc.‖ Note that this is the same value
of $r_{\rm max}$ as used by Zehavi et al. (2011) for computing the
projected correlation functions of faint galaxies in the SDSS
DR7. Next we compute the same $w_p(r_p)$, but this time we set
the peculiar velocities of all galaxies to zero, i.e., we simply
set $r_\pi = \sqrt{r^2 - r_p^2}$, where $r$ is the real-space separation
between two galaxies. The ratio of these two 'measurements'
of the projected correlation function, shown as filled
circles in Fig. 5, indicates the error one makes in the estimate
of $w_p(r_p)$ when ignoring the RRSDs, i.e., when computing
$w_p(r_p)$ using

$$w_p(r_p) = 2 \int_{r_p}^{r_{\rm out}} \xi_{\rm gg}(r)\, \frac{r\, {\rm d}r}{\sqrt{r^2 - r_p^2}} \qquad (87)$$

with $r_{\rm out} = \sqrt{r_p^2 + r_{\rm max}^2}$. As discussed in §2.3, this is the
standard method used by numerous authors in the past.
The MGD results in Fig. 5 show that ignoring RRSDs
causes an error in $w_p(r_p)$ that exceeds 10 percent on scales
$> 10\,h^{-1}$ Mpc. Note, though, that in the MGD we only
populated haloes in the mass range $10^{13}\,h^{-1}\,M_\odot < M <
10^{14.5}\,h^{-1}\,M_\odot$. As we show below, using the full mass range
results in RRSDs that are even larger.
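The real-space projection of Eq. (87) can be sketched in a few lines. Substituting $r = \sqrt{r_p^2 + r_\pi^2}$ turns it into $w_p(r_p) = 2\int_0^{r_{\rm max}} \xi_{\rm gg}(\sqrt{r_p^2 + r_\pi^2})\,{\rm d}r_\pi$, which avoids the integrable singularity at $r = r_p$. The snippet below uses this form with a midpoint rule. The power-law correlation function, its parameters and all function names are hypothetical stand-ins; note also that this sketch only quantifies the real-space truncation and does not model the redshift space distortions themselves.

```python
import math


def projected_xi(xi, r_p, r_max, n=4000):
    """2 * int_0^r_max xi(sqrt(r_p^2 + r_pi^2)) d r_pi (midpoint rule)."""
    d_pi = r_max / n
    total = 0.0
    for i in range(n):
        r_pi = (i + 0.5) * d_pi
        total += xi(math.hypot(r_p, r_pi))
    return 2.0 * total * d_pi


def power_law_xi(r, r0=5.0, gamma=1.8):
    """Toy real-space correlation function (r / r0)^(-gamma)."""
    return (r / r0) ** (-gamma)


def projected_power_law_infinite(r_p, r0=5.0, gamma=1.8):
    """Analytic projection of the toy power law for r_max -> infinity."""
    return (r_p * (r0 / r_p) ** gamma * math.gamma(0.5)
            * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma))


# Fraction of the full projection recovered with r_max = 40 (toy units).
truncation_ratio = {
    r_p: projected_xi(power_law_xi, r_p, 40.0) / projected_power_law_infinite(r_p)
    for r_p in (1.0, 10.0, 20.0)
}
print(truncation_ratio)
```

The recovered fraction drops with increasing $r_p$ at fixed $r_{\rm max}$, which illustrates why a finite integration range matters most on large projected scales.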

The dashed line indicates the correction factor $f_{\rm corr}$
given by Eq. (48). This correction factor is based on the
Kaiser formalism for the linear velocity field, and is
computed using the linear galaxy-galaxy correlation function
given by Eq. (49). Note that the resulting $f_{\rm corr}$ provides a
fairly accurate description of the RRSDs resulting from
using a finite $r_{\rm max}$, at least at large scales. However, on small
scales it clearly overpredicts $f_{\rm corr}$ by a few percent. Hence,
using this correction factor would overpredict $w_p(r_p)$ by a
similar amount on small scales.

The solid line shows the correction factor obtained by
simply replacing $\xi^{\rm lin}_{\rm gg}(r)$ in Eq. (48) and Eqs. (51)-(55) by the
non-linear version $\xi_{\rm gg}(r)$. Although the Kaiser formalism is
strictly only valid in the linear regime, this simple
modification works remarkably well; the model now accurately
reproduces the mock results on small scales. On larger scales, the
model somewhat overpredicts $f_{\rm corr}$ compared to the mock
results. From the ratio between the two we estimate that
the final error we make on $w_p(r_p)$ from the imperfect
correction for RRSDs is always less than 2 percent over the
scales of interest.

‖ Here we have assumed that the plane-parallel approximation
holds.

Figure 6. The RRSD correction factor, $f_{\rm corr}(r_p)$, for different
values of the integration range $r_{\rm max}$, as indicated. All these
correction factors have been obtained for galaxies with
$-21 < {}^{0.1}M_r - 5\log h < -19.5$, assuming the same cosmology and CLF
as for the L1000W mock (i.e., similar to the middle column in
Fig. 5). Note that $f_{\rm corr}$ for $r_{\rm max} = 40\,h^{-1}$ Mpc is larger than in
the case of Fig. 5; this is due to the fact that here we integrate
over all halo masses, whereas in Fig. 5 we only considered haloes
with $10^{13}\,h^{-1}\,M_\odot < M < 10^{14.5}\,h^{-1}\,M_\odot$ in order to allow for
a fair comparison with the mock results. Note also that even for
$r_{\rm max} = 200\,h^{-1}$ Mpc the correction factor exceeds 5 percent for
$r_p > 30\,h^{-1}$ Mpc.

Finally, having demonstrated that $f_{\rm corr}(r_p, z)$, obtained
using the non-linear galaxy-galaxy correlation function,
provides an accurate description of the RRSDs that arise from
using a finite integration range, we can use it to
predict the magnitude of RRSDs for different values of $r_{\rm max}$.
Fig. 6 shows $f_{\rm corr}(r_p)$ for five different values of $r_{\rm max}$, as
indicated. Contrary to the results shown in Fig. 5, which
only considered haloes in the mass range
$10^{13}\,h^{-1}\,M_\odot < M < 10^{14.5}\,h^{-1}\,M_\odot$ in order to allow for direct comparison
with the mock results, the results in Fig. 6 have been
obtained by integrating over all halo masses. Note that this
results in $f_{\rm corr}$ values for $r_{\rm max} = 40\,h^{-1}$ Mpc that are
significantly larger than those in Fig. 5. In particular, using
$r_{\rm max} = 40\,h^{-1}$ Mpc without a correction for RRSDs
underestimates $w_p(r_p)$ at $r_p = 20\,h^{-1}$ Mpc by $\sim 35$ percent! Even
when using $r_{\rm max} = 200\,h^{-1}$ Mpc, the RRSDs cause errors in
the projected correlation function that exceed 5 percent for
$r_p \gtrsim 30\,h^{-1}$ Mpc. Clearly, correcting for RRSDs is extremely
important, especially when using projected correlation
functions to constrain cosmological parameters. The modified
Kaiser method presented here corrects for these RRSDs to
an accuracy of better than 2 percent.



5 SHAPES, ALIGNMENT, SUBSTRUCTURE 
AND CONTRACTION OF DARK HALOES 

As discussed in §3.5, our model assumes that dark matter 
haloes are spheres with an NFW density profile. Clearly, 
this is a highly oversimplified picture. In reality, dark mat- 
ter haloes are triaxial, have substructure, and have a den- 
sity profile that may have been modified due to the ac- 
tion of galaxy formation. In addition, our model ignores 
the fact that there is significant scatter in the relation be- 
tween halo mass and halo concentration. After discussing 
how each of these effects impacts the accuracy of our
oversimplified model, we show how we can take these shortcom-
ings into account by marginalizing over the normalization of 
the concentration-mass relation of dark matter haloes. 



5.1 Halo Shapes and Alignment 

The assumption that dark matter haloes are spherical is in- 
consistent with expectations based on numerical simulations 
(e.g., Jing & Suto 2002; Bailin & Steinmetz 2005; Allgood 
et al. 2006) and/or non-spherical collapse conditions (e.g., 
Zel'dovich 1970; Icke 1973; White & Silk 1979). As shown by 
Yang et al. (2004), assuming that haloes are spherical un- 
derestimates the correlation function obtained if haloes are 
represented by FOF groups in numerical simulations by as 
much as $\sim 20$ percent on small scales ($r \sim 0.1\,h^{-1}$ Mpc). A
similar test was recently performed by van Daalen, Angulo 
& White (2011), who basically came to the same conclusion. 
However, these tests of the impact of halo triaxiality are not 
directly applicable to our model. After all, our model uses 
halo mass functions and halo bias functions in which haloes 
are specifically defined as spherical volumes. Hence, a fair 
assessment of the impact of the non-spherical symmetry of 
dark matter haloes on our results should compare a correla- 
tion function in which it is assumed that all matter within 
the spherical volume of the halo has spherical symmetry 
(i.e., our model assumption) to one in which the dark mat- 
ter particles and galaxies within the same spherical volume 
are given a more realistic distribution that is not spherically 
symmetric. Note that this is not the same as a comparison of 
spherical haloes to FOF haloes, since the latter typically do 
not occupy a spherical volume. As demonstrated by More et 
al. (2012, in preparation), this yields correlation functions 
that only differ at the 5 to 10 percent level. Detailed theo- 
retical calculations by Smith & Watts (2005) reach a similar 
conclusion, that ignoring halo triaxiality only impacts the 
two-point correlation functions at the level of ~ 5 percent. 
This is also consistent with Li et al. (2009), who performed 
detailed tests that showed that non-sphericity of dark matter 
haloes has only a small effect of < 5 percent on the excess 
surface densities, and only on the smallest scales probed by 
the data. Hence, we conclude that our model assumption 
that haloes are spherical may underpredict both $\xi_{\rm gg}(r)$ and
$\xi_{\rm gm}(r)$ on small scales ($r < 1\,h^{-1}$ Mpc), but by no more than
$\sim 10$ percent.

However, the fact that haloes have triaxial, rather than 
spherical shapes, also implies that another effect might in 
principle be important, namely halo alignment. Such poten- 
tial alignment between haloes is not accounted for in our 
model, which therefore might cause systematic errors in our 
two-point correlation functions. However, Smith & Watts
(2005) have shown that a strict upper bound for the effect
of intrinsic alignment is a 10 percent effect on the two-point
correlation function (corresponding to a scenario with
maximum alignment). Van Daalen et al. (2011) have shown that
realistic amounts of alignment, as present in numerical
simulations of structure formation in a $\Lambda$CDM cosmology, have
an effect on the correlation functions that is not larger than
$\sim 2$ percent. We therefore conclude that potential halo
alignment can be safely ignored.

5.2 Halo Concentrations 

As discussed in §3.5, we assume that dark matter haloes 
have NFW density profiles with a concentration-mass rela- 
tion given by Maccio et al. (2007), properly converted to our 
definition of halo mass. This ignores, however, that there is 
a substantial amount of scatter in the concentration-mass 
relation. In particular, numerical simulations show that the 
concentrations, c, for haloes of mass M at redshift z follow 
a log-normal distribution 



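$$P(c|M, z)\, {\rm d}c = \frac{1}{\sqrt{2\pi}\, \sigma_{\ln c}}\, \exp\left[-\frac{(\ln c - \ln \bar{c})^2}{2\sigma_{\ln c}^2}\right] \frac{{\rm d}c}{c} \qquad (88)$$

where $\bar{c} = \bar{c}(M, z)$ is the median halo concentration for a
halo of mass $M$ at redshift $z$, and $\sigma_{\ln c} \simeq 0.3$ (Jing 2000;
Bullock et al. 2001; Wechsler et al. 2002; Sheth & Tormen
2004; Maccio et al. 2007). Because of this scatter, the proper
$\tilde{u}_{\rm h}(k|M, z)$ to use in the halo model is

$$\tilde{u}_{\rm h}(k|M, z) = \int_0^{\infty} \tilde{u}_{\rm h}(k|M, z, c)\, P(c|M, z)\, {\rm d}c \qquad (89)$$

(Giocoli et al. 2010b). However, in order to speed up
the computations, we ignore this scatter and simply use
$\tilde{u}_{\rm h}(k|M, z) = \tilde{u}_{\rm h}(k|\bar{c}(M, z))$ instead.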
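The average over the log-normal concentration distribution in Eq. (89) can be sketched as follows. The snippet assumes an NFW profile truncated at the halo radius, works in the dimensionless variable $k\,r_{\rm h}$ (with $r_{\rm h}$ the halo radius, which is fixed at fixed halo mass), and evaluates both the Fourier transform and the average over concentration with simple midpoint rules. All function names and numerical settings are hypothetical and are chosen for clarity rather than speed.

```python
import math


def u_nfw(k_rh, c, n=2000):
    """Normalized Fourier transform of an NFW profile truncated at r_h.

    k_rh : wavenumber times halo radius (dimensionless)
    c    : concentration r_h / r_s
    Uses x = r / r_s, for which the mass per unit x scales as x / (1 + x)^2.
    """
    kappa = k_rh / c  # k * r_s
    dx = c / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        kx = kappa * x
        sinc = 1.0 if kx < 1e-8 else math.sin(kx) / kx
        total += x / (1.0 + x) ** 2 * sinc
    enclosed = math.log(1.0 + c) - c / (1.0 + c)  # int_0^c x / (1 + x)^2 dx
    return total * dx / enclosed


def u_scatter_averaged(k_rh, c_median, sigma_lnc, n_t=41, t_max=4.0):
    """Average of u_nfw over a log-normal P(c) with median c_median.

    Integrates over t = (ln c - ln c_median) / sigma_lnc with Gaussian
    weights on a uniform grid, normalized by the sum of the weights.
    """
    if sigma_lnc == 0.0:
        return u_nfw(k_rh, c_median)
    dt = 2.0 * t_max / n_t
    weighted_sum = 0.0
    weight_sum = 0.0
    for i in range(n_t):
        t = -t_max + (i + 0.5) * dt
        weight = math.exp(-0.5 * t * t)
        weighted_sum += weight * u_nfw(k_rh, c_median * math.exp(sigma_lnc * t))
        weight_sum += weight
    return weighted_sum / weight_sum


# Fractional change caused by the scatter at one illustrative wavenumber.
k_rh_example = 20.0
boost = u_scatter_averaged(k_rh_example, 10.0, 0.3) / u_nfw(k_rh_example, 10.0) - 1.0
print(boost)
```

Replacing the averaged profile by `u_nfw` evaluated at the median concentration removes the loop over concentration, so the profile is evaluated once instead of `n_t` times; this is the saving gained by ignoring the scatter.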

The impact of this oversimplification is shown in Fig. 7,
where the symbols show $\tilde{u}_{\rm h}(k|M, z)/\tilde{u}_{\rm h}(k|\bar{c}(M, z)) - 1$, with
$\tilde{u}_{\rm h}(k|M, z)$ given by Eq. (89). Results are shown for three
different values of $\sigma_{\ln c}$, as indicated, and are obtained
using $M = 10^{12}\,h^{-1}\,M_\odot$ and $\bar{c} = 10$. Taking the scatter
in halo concentration into account boosts $\tilde{u}_{\rm h}(k)$ on small
scales ($k > 10\,h\,{\rm Mpc}^{-1}$) by an amount that increases with
$\sigma_{\ln c}$ (see also Cooray & Hu 2001 and Giocoli et al. 2010b).
For $\sigma_{\ln c} = 0.3$ this boost is of the order of 10 percent.
The solid lines in Fig. 7 show $\tilde{u}_{\rm h}(k|c)/\tilde{u}_{\rm h}(k|\bar{c}) - 1$, where
$c = \bar{c}(1 + 0.8\sigma^2_{\ln c})$. Although certainly not a perfect fit,
this simple relation gives a reasonable description of the
impact of ignoring the scatter in $P(c|M, z)$. It shows that for
$\sigma_{\ln c} = 0.3$, the error made by ignoring this scatter is similar
to the error made if $\bar{c}(M, z)$ is underestimated by a factor
$1 + 0.8\sigma^2_{\ln c} \simeq 1.07$. This is comparable to the differences in
the $\bar{c}(M, z)$ relation obtained by different authors (e.g., Eke,
Navarro & Steinmetz 2001; Bullock et al. 2001; Maccio et
al. 2007; Zhao et al. 2009). Hence, it is at least as
important to obtain a more reliable calibration of the median of
$P(c|M, z)$ as it is to take account of its scatter. As we discuss
in §5.5 below, because of these uncertainties, and because
of other oversimplifications of our model, we will
marginalize over the normalization of the concentration-mass
relation, $\bar{c}(M, z)$, when constraining cosmological parameters
(see Paper III). The results shown here indicate that such a
marginalization also captures the inaccuracies arising from
the fact that we ignore the scatter in $P(c|M)$.
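The effective rescaling quoted above is a one-line calculation, sketched here for reference. The function name is hypothetical; the coefficient 0.8 and the value of the scatter are those given in the text.

```python
def effective_concentration(c_median, sigma_lnc):
    """Concentration that mimics log-normal scatter: c_median * (1 + 0.8 sigma^2)."""
    return c_median * (1.0 + 0.8 * sigma_lnc ** 2)


# Scatter of 0.3 in ln(c): 1 + 0.8 * 0.09 = 1.072, i.e. a boost of about 7 percent.
print(effective_concentration(1.0, 0.3))
print(effective_concentration(10.0, 0.3))
```

For a median concentration of 10 this corresponds to an effective concentration of about 10.7.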



[Figure 7: plot with horizontal axis $\log(k)$ [$h\,{\rm Mpc}^{-1}$]]

Figure 7. The ratio $\tilde{u}_{\rm h}(k|M)/\tilde{u}_{\rm h}(k|\bar{c}) - 1$ as function of the
wavenumber $k$ for three different values of the scatter $\sigma_{\ln c}$ in
$P(c|M)$, as indicated (open symbols). Here $\tilde{u}_{\rm h}(k|M)$ is the Fourier
Transform of the average normalized density profile of NFW
haloes of mass $M$, properly accounting for the non-zero scatter in
$P(c|M)$ (Eq. [89]), while $\tilde{u}_{\rm h}(k|\bar{c})$ is the normalized density profile
for the median halo concentration, $\bar{c}$. Hence, this ratio indicates
the error made in $\tilde{u}_{\rm h}(k|M)$ when ignoring the scatter in halo
concentration. The solid lines show the same ratio, but this time
$\tilde{u}_{\rm h}(k|M)$ is computed under the assumption of zero scatter, and
using a concentration parameter $c = \bar{c}(1 + 0.8\sigma^2_{\ln c})$. The
reasonable agreement with the open symbols indicates that, to good
approximation, one can mimic the effect of non-zero scatter in
$P(c|M)$ by simply computing $\tilde{u}_{\rm h}(k|M)$ for a halo concentration
that is a factor $1 + 0.8\sigma^2_{\ln c}$ larger than the median concentration.



5.3 Halo Substructure 

Another oversimplification of our model is that we assume 
that dark matter haloes have a smooth density distribution. 
However, numerical simulations of hierarchical structure
formation have shown that haloes are not smooth, but have a
significant population of dark matter subhaloes (e.g., Moore 
et al. 1998; Springel et al. 2001). Approximately 10 percent 
of the mass of a dark matter halo is associated with these 
subclumps, with a weak dependence on halo mass and cos- 
mology (e.g., Gao et al. 2004; van den Bosch et al. 2005b; 
Giocoli et al. 2008, 2010a). Since these subhaloes are be- 
lieved to host satellite galaxies, they will impact the galaxy- 
matter cross correlation function on small scales. Although 
formalisms to include dark matter substructure in the halo 
model have been developed (e.g., Sheth & Jain 2003; Gio- 
coli et al. 2010b), the implementation is numerically cum- 
bersome in that it adds a number of integrations, causing a 
very significant increase in the computation time per model. 
In addition, the model still involves a number of uncertain- 
ties, such as the density profiles of dark matter subhaloes. 

Fortunately, as shown by Mandelbaum et al. (2005),
Yoo et al. (2006) and Li et al. (2009), the impact of
substructure is negligible on the radial scales of interest, i.e.,
on the scales for which we currently have data on $\Delta\Sigma(R)$
available ($R \gtrsim 0.05\,h^{-1}$ Mpc). Hence, we conclude that we
do not make significant errors by ignoring dark matter
substructure.



5.4 The Impact of Baryons 

Although numerical simulations of structure formation have 
established that dark matter haloes follow a universal profile 
that is accurately described by the NFW profile (Eq. [68]), 
this ignores the impact of baryons. During the process of 
galaxy formation, baryons collect at the center of the halo 
potential well and may subsequently be expelled due to feed- 
back processes. Because of the gravitational interaction be- 
tween baryons and dark matter, the dark matter halo will 
respond to this galaxy formation process. 

It is often assumed that the impact of baryons is to 
cause (adiabatic) contraction of the dark matter haloes (e.g., 
Blumenthal et al. 1986; Gnedin et al. 2004; Abadi et al. 2010; 
see also Schulz, Mandelbaum & Padmanabhan 2010; More et 
al. 2012b for observational support). However, it is also pos- 
sible for haloes to expand in response to galaxy formation; 
rapid mass-loss from the galaxy due to (repetitive) feed- 
back from supernovae and/or AGN (e.g., Pontzen & Gover- 
nato 2012), dynamical friction operating on baryonic clumps 
(e.g., El-Zant, Shlosman & Hoffman 2001; Mo & Mao 2004), 
and galactic bars (e.g., Weinberg & Katz 2002) all may cause 
dark matter haloes to become less centrally concentrated 
than their 'pristine' (i.e., without galaxy formation) coun- 
terparts. 

Interestingly, both galaxy rotation curves and galaxy 
scaling relations suggest that dark matter haloes are less 
centrally concentrated than what is expected in the ab- 
sence of baryonic processes in a CDM dominated universe 
(e.g., Swaters et al. 2003; de Blok et al. 2008; Dutton et 
al. 2007, 2011; Trujillo-Gomez et al. 2011). Although this 
may suggest that galaxy formation indeed results in a net 
halo expansion, it may also indicate that dark matter is not
cold, but warm (e.g., Sommer-Larsen & Dolgov 2000) or
self-interacting (e.g., Spergel & Steinhardt 2000).

We conclude that the detailed density profiles of dark 
matter haloes carry a significant uncertainty, which needs to 
be accounted for. 



5.5 Marginalization 

All the effects discussed above, regarding halo shape, scatter 
in halo concentrations, halo substructure, and halo contrac- 
tion/expansion, impact the 1-halo terms of the correlation 
functions by either boosting or suppressing power on small 
scales. What is ultimately of importance for the accuracy of 
our models is the combined impact of all these effects. 

The combined impact of all effects except for that of 
halo contraction/expansion can be gauged from the lower 
panels of Fig. 3, which show that our model is consistent 
with the simulation results, in which the haloes have real- 
istic, triaxial density distributions, have substructure, and 
have non-zero scatter in the concentration-mass relation, to 
better than 10 percent. This test therefore confirms that our 
oversimplifications are accurate at the 10 percent level. 

Figure 8. The impact on the galaxy-matter cross correlation
function, $\xi_{\rm gm}(r)$, of multiplying the normalization of the
concentration-mass relation, $\bar{c}(M)$, of dark matter haloes by a
factor $(1 + \eta)$, where $\eta = \pm 0.1$ (dashed lines) or $\eta = \pm 0.2$ (solid
lines). Here we have, once again, adopted the same cosmology
and CLF as for the mocks described in §4.1.

We caution, though, that this test does not account for
possible halo contraction/expansion due to baryons, whose
impact is difficult to gauge in the absence of a more detailed
understanding of galaxy formation. Hence, when constraining
cosmological parameters (see Paper III), we will take all
these oversimplifications regarding the density distributions
of dark matter haloes into account by marginalizing over the
normalization of the concentration-mass relation, $\bar{c}(M, z)$. In
particular, we introduce the parameter $\eta$, so that the
concentration for a halo of mass $M$ is given by $(1 + \eta) \times \bar{c}(M, z)$,
where $\bar{c}(M, z)$ is the average concentration-mass relation of
Maccio et al. (2007), properly converted to our definition of
halo mass. As a prior we assume that the probability
distribution function (PDF) for $\eta$ is given by



$$P(\eta) = \frac{1}{\sqrt{2\pi}\, \sigma_\eta}\, \exp\left[-\frac{\eta^2}{2\sigma_\eta^2}\right] \qquad (90)$$



where we adopt $\sigma_\eta = 0.1$. Fig. 8 shows the impact of $\eta$
on the galaxy-matter cross-correlation function for galaxies
with magnitudes in the range $-18 > {}^{0.1}M_r - 5\log h > -19.5$
(results for other magnitude bins are very similar). The
dashed and solid lines show the fractional changes in $\xi_{\rm gm}(r)$
for $\eta = \pm 0.1$ and $\pm 0.2$, respectively, which correspond to
the 68 and 95 percent confidence intervals of the prior PDF.
Note how $\eta = \pm 0.2$ modifies the one-halo term of $\xi_{\rm gm}(r)$
by more than 20 percent on small scales ($r < 0.1\,h^{-1}$ Mpc),
which we argue is more than adequate to capture the
inaccuracies in our model that arise from the various
oversimplifications discussed above (see Paper III for more details,
and for a discussion of the posterior distribution of $\eta$ and its
implications).
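A minimal sketch of how such a prior could enter a likelihood analysis is given below. It assumes a zero-mean Gaussian prior with a width of 0.1, consistent with the 68 and 95 percent intervals quoted above; the function names are hypothetical, and this is not the sampling code used in Paper III.

```python
import math

SIGMA_ETA = 0.1  # width of the prior adopted in the text


def ln_prior_eta(eta, sigma=SIGMA_ETA):
    """Natural log of a zero-mean Gaussian prior on eta."""
    return -0.5 * (eta / sigma) ** 2 - math.log(math.sqrt(2.0 * math.pi) * sigma)


def prior_mass_within(eta_max, sigma=SIGMA_ETA):
    """Prior probability that |eta| < eta_max."""
    return math.erf(eta_max / (sigma * math.sqrt(2.0)))


def rescaled_concentration(c_relation, eta):
    """Concentration used in the model: (1 + eta) times the adopted relation."""
    return (1.0 + eta) * c_relation


print(prior_mass_within(0.1), prior_mass_within(0.2))
print(rescaled_concentration(10.0, 0.2), rescaled_concentration(10.0, -0.2))
```

The two printed probabilities are approximately 0.683 and 0.954, matching the 68 and 95 percent intervals associated with $\eta = \pm 0.1$ and $\pm 0.2$.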
6 CONCLUSIONS

Galaxies are abundant and visible to high redshifts, making 
them, in principle, excellent tracers of the mass distribu- 
tion in the Universe over cosmological scales. The problem, 
however, is that galaxies are biased tracers, and that this 
bias is a complicated function of scale, luminosity, morpho- 
logical type, etc. It is an imprint of the poorly understood 
physics related to galaxy formation. On sufficiently large 
scales, galaxy bias is expected to be scale-independent with 
a value that is known to depend on a variety of galaxy prop- 
erties such as luminosity and color (e.g., Norberg et al. 2001, 
2002; Zehavi et al. 2005, 2011; Wang et al. 2007). On small, 
(quasi) non-linear scales ($r < 3\,h^{-1}$ Mpc), galaxy bias
becomes strongly scale-dependent (e.g., Cacciato et al. 2012a),
making it extremely difficult to infer any constraints on cos- 
mology, without having a proper, detailed method of either 
measuring this bias or marginalizing over it. For this rea- 
son, almost all studies to date that used the distribution of 
galaxies in order to constrain cosmological parameters have 
focused on large, linear scales, and treated galaxy bias as a 
'nuisance parameter' that needs to be marginalized over. 

In this paper, the first in a series, we have presented a 
new method, similar to that of Yoo et al. (2006) and Leau- 
thaud et al. (2011), that can simultaneously solve for cosmol- 
ogy and galaxy bias on small, non-linear scales. The method 
uses the halo model to analytically describe the (non-linear) 
matter distribution, and the conditional luminosity function 
(CLF) to specify the halo occupation statistics. For a given 
choice of cosmological parameters, which determine the halo 
mass function, the halo bias function, and the (non-linear) 
matter power spectrum, this model can be used to predict 
the galaxy luminosity function, the two-point correlation 
functions of galaxies as a function of both scale and luminosity,
and the galaxy-galaxy lensing signal, again as a function
of both scale and luminosity. These are all observables that
have been measured at unprecedented accuracies from the 
Sloan Digital Sky Survey, and can therefore be used to con- 
strain cosmological parameters. 

In this paper we presented, in detail, our analytical 
framework, which is characterized by 

• a treatment for scale dependence of halo bias on small 
scales, using a modified version of the empirical fitting func- 
tion of Tinker et al. (2005). 

• a proper treatment for halo exclusion, similar to that of 
Smith et al. (2007), which is correct under the assumption 
that dark matter haloes are spherical. 

• a correction for residual redshift space distortions 
(RRSDs) using a slightly modified version of the linear 
Kaiser formalism. 

We have tested the accuracy of our analytical model using 
detailed mock galaxy distributions, constructed using high-resolution
numerical $N$-body simulations. We have shown
that our analytical model is accurate to better than 10 per- 
cent (in most cases better than 5 percent), in reproducing 
the 3-dimensional galaxy-galaxy correlation and the galaxy-matter
correlation in the mock galaxy distributions over a
wide range of scales ($0.05\,h^{-1}$ Mpc $< r < 30\,h^{-1}$ Mpc). In order
to reach this level of accuracy we had to introduce, and
tune, one free parameter that describes a modification of the 
empirical fitting function of Tinker et al. (2005) for the radial
halo bias dependence. This modification is required because
this fitting function is only valid for a particular definition 
of halo mass that is different than the one adopted here 
(see also Tinker et al. 2012). When fitting the data in order
to constrain cosmological parameters, we will marginalize
over uncertainties in this free parameter (see Papers II
and III). We have demonstrated that ignoring halo exclusion
and/or the scale dependence of the halo bias results in
errors in $\xi_{\rm gg}(r)$ and $\xi_{\rm gm}(r)$ in the 1-halo to 2-halo transition
regime ($r \sim 1\,h^{-1}$ Mpc) that can easily be as large as
40 percent. The correction for RRSDs is necessary because 
projected correlation functions are always obtained by integrating
along the line-of-sight out to a finite radius (typically
$r_{\rm max} \sim 40 - 80\,h^{-1}$ Mpc) rather than out to infinity. In
agreement with the results of Norberg et al. (2009), we show
that not taking these RRSDs into account results in system- 
atic errors that can easily exceed 20 percent on large scales 
($r_{\rm p} > 10\,h^{-1}$ Mpc), which can cause systematic errors in the
inferred galaxy bias (see More 2011). As we demonstrate 
in Paper III, when unaccounted for, these RRSDs can also
result in significant systematic errors in the inferred cosmo- 
logical parameters. Fortunately, as we have demonstrated, 
it is fairly straightforward to correct for these RRSDs, to an 
accuracy better than ~ 2 percent, using a mildly modified 
version of the linear Kaiser formalism (Kaiser 1987). 
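
The size of the RRSD effect can be illustrated with a toy calculation. The sketch below assumes a pure power-law real-space correlation function, $\xi(r) = (r/r_0)^{-\gamma}$, and the standard, unmodified linear Kaiser multipole expansion, for which the multipole amplitudes have closed forms. It is not the mildly modified formalism used in this paper, and the values of `beta`, `gamma` and `r0`, as well as all function names, are arbitrary assumptions made for illustration. The code compares the projected correlation function obtained by integrating the redshift-space $\xi(r_{\rm p}, r_\pi)$ out to `rmax` with the one obtained from the real-space $\xi(r)$ over the same range.

```python
import math


def xi_real(r, r0=5.0, gamma=1.8):
    """Toy power-law real-space correlation function (an assumption)."""
    return (r / r0) ** (-gamma)


def xi_kaiser(rp, rpi, beta=0.5, r0=5.0, gamma=1.8):
    """Linear Kaiser redshift-space xi(rp, rpi) for a power-law xi(r).

    Uses the monopole, quadrupole and hexadecapole amplitudes, which for a
    power law reduce to constant multiples of xi(s).
    """
    s = math.hypot(rp, rpi)
    mu = rpi / s  # cosine of the angle to the line of sight
    p2 = 0.5 * (3.0 * mu**2 - 1.0)
    p4 = (35.0 * mu**4 - 30.0 * mu**2 + 3.0) / 8.0
    a0 = 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0
    a2 = (4.0 * beta / 3.0 + 4.0 * beta**2 / 7.0) * gamma / (gamma - 3.0)
    a4 = (8.0 * beta**2 / 35.0) * gamma * (gamma + 2.0) / (
        (3.0 - gamma) * (5.0 - gamma)
    )
    return xi_real(s, r0, gamma) * (a0 + a2 * p2 + a4 * p4)


def project(xi_2d, rp, rmax, n=20000):
    """w_p(rp) = 2 * integral_0^rmax xi_2d(rp, rpi) d rpi (midpoint rule)."""
    h = rmax / n
    return 2.0 * h * sum(xi_2d(rp, (i + 0.5) * h) for i in range(n))


def rrsd_ratio(rp, rmax, beta=0.5, n=20000):
    """Ratio of redshift-space to real-space w_p for the same finite rmax."""
    w_z = project(lambda p, q: xi_kaiser(p, q, beta), rp, rmax, n)
    w_r = project(lambda p, q: xi_real(math.hypot(p, q)), rp, rmax, n)
    return w_z / w_r


if __name__ == "__main__":
    for rp in (1.0, 5.0, 20.0):
        print(rp, rrsd_ratio(rp, rmax=40.0))
```

In this toy model with `rmax = 40` (in $h^{-1}$ Mpc), the ratio exceeds unity by a few percent at $r_{\rm p} = 1\,h^{-1}$ Mpc and by more than 20 percent at $r_{\rm p} = 20\,h^{-1}$ Mpc, and it tends to unity as `rmax` is increased, since the distortions only move pairs along the line of sight.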

Finally, the good accuracy of our analytical model on 
small scales for the galaxy-matter and halo-matter cross-correlation
functions (better than 10 percent) indicates that ignoring
halo triaxiality, halo substructure, and scatter in the
halo concentration-mass relation does not have a large im- 
pact, contrary to recent claims by van Daalen et al. (2011) 
who argue that halo triaxiality alone may cause inaccura- 
cies as large as 20 percent. We argue that this apparent dis- 
crepancy mainly owes to different definitions of dark matter 
haloes (see discussion in § 5.1). Nevertheless, we have shown 
that, in order to be conservative, one can take these inac- 
curacies that arise from oversimplifications of the halo mass 
distributions into account by marginalizing over uncertain- 
ties in the normalization of the concentration-mass relation 
of dark matter haloes. 

As indicated above, this is the first paper in a series. 
In Paper II (More et al. 2012a), we perform a Fisher ma- 
trix analysis to (i) investigate the strength of each of the 
datasets (luminosity function, projected correlation func- 
tions, and excess surface densities), (ii) identify various de- 
generacies between our model parameters, and (iii) forecast 
the accuracy with which various cosmological parameters 
and CLF parameters can be constrained with current data. 
In Paper III (Cacciato et al. 2012b) we apply our method 
to data from the Sloan Digital Sky Survey and present the 
resulting constraints on both cosmological parameters (fully 
marginalized over the uncertainties related to galaxy bias) 
and the CLF parameters (fully marginalized over uncertain- 
ties in cosmological parameters). 
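
To make the Fisher matrix terminology used above concrete, here is a generic toy sketch that is unrelated to the actual model or data of this series. It assumes a hypothetical two-parameter linear model with independent Gaussian errors, builds $F_{ij} = \sum_k (\partial m_k/\partial\theta_i)(\partial m_k/\partial\theta_j)/\sigma_k^2$ from finite-difference derivatives, and compares the conditional errors, $1/\sqrt{F_{ii}}$, with the marginalized errors, $\sqrt{(F^{-1})_{ii}}$. All names and numbers are illustrative assumptions.

```python
import math


def model(theta, x):
    """Hypothetical toy model m(x) = a + b * x."""
    a, b = theta
    return a + b * x


def fisher_matrix(theta, xs, sigmas, step=1e-4):
    """Fisher matrix for independent Gaussian errors, central differences."""
    n = len(theta)
    derivs = []
    for i in range(n):
        up, dn = list(theta), list(theta)
        up[i] += step
        dn[i] -= step
        derivs.append([(model(up, x) - model(dn, x)) / (2.0 * step) for x in xs])
    return [
        [
            sum(derivs[i][k] * derivs[j][k] / sigmas[k] ** 2 for k in range(len(xs)))
            for j in range(n)
        ]
        for i in range(n)
    ]


def forecast_2x2(F):
    """Marginalized errors, conditional errors and correlation for a 2x2 F."""
    det = F[0][0] * F[1][1] - F[0][1] * F[1][0]
    cov = [[F[1][1] / det, -F[0][1] / det], [-F[1][0] / det, F[0][0] / det]]
    marginalized = [math.sqrt(cov[0][0]), math.sqrt(cov[1][1])]
    conditional = [1.0 / math.sqrt(F[0][0]), 1.0 / math.sqrt(F[1][1])]
    correlation = cov[0][1] / (marginalized[0] * marginalized[1])
    return marginalized, conditional, correlation


if __name__ == "__main__":
    F = fisher_matrix([1.0, 2.0], xs=[0.0, 1.0, 2.0, 3.0], sigmas=[1.0] * 4)
    print(forecast_2x2(F))
```

The marginalized errors are never smaller than the conditional ones, and a correlation coefficient far from zero signals a degeneracy between the two parameters, which is the kind of information a Fisher analysis is used to expose.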